PostgreSQL: could not open file “/var/lib/pgsql/data/pg_clog/0726”: No such file or directory

when selecting a specific date from a table, at times due to database corruption Postgres might report the file was not found.

hsphere=# select * from TABLE;
ERROR:  could not access status of transaction 1918986094
DETAIL:  could not open file "/var/lib/pgsql/data/pg_clog/0726": No such file or directory

In Such instances, we will can re-create the file using the below dd command (note: data loss on the missing file)

dd if=/dev/zero of=/var/lib/pgsql/data/pg_clog/0726 bs=256k count=1

Note: Ensure the permissions on the file that was re-created are set appropriately. creating the blank file means that the data that was originally on it might actually go missing. The respective records will need to be re-created or its relevant records (constraints etc) will need to be removed from the DB.

Statement diagnostic data from driver is XX002:0:7:ERROR: index “pk_index_name” contains corrupted page at block xxxx;

on the vpxd.log:
--> Error while executing the query
2018-09-25T21:17:09.351Z error vpxd[7F3FC958B800] [Originator@6876 sub=Default] [VdbStatement] Connection diagnostic data from driver is HY000:0:110:
2018-09-25T21:17:09.351Z error vpxd[7F3FC958B800] [Originator@6876 sub=Default] [VdbStatement] Bind parameters:
2018-09-25T21:17:09.351Z error vpxd[7F3FC958B800] [Originator@6876 sub=Default] [VdbStatement] [0]datatype: 1, size: 4, arraySize: 0
2018-09-25T21:17:09.351Z error vpxd[7F3FC958B800] [Originator@6876 sub=Default] [VdbStatement] value = 172237
2018-09-25T21:17:09.351Z error vpxd[7F3FC958B800] [Originator@6876 sub=Default] [Vdb::IsRecoverableErrorCode] Unable to recover from XX002:7
2018-09-25T21:17:09.351Z error vpxd[7F3FC958B800] [Originator@6876 sub=Default] [Vdb::IsRecoverableErrorCode] Unable to recover from HY000:110
2018-09-25T21:17:09.351Z error vpxd[7F3FC958B800] [Originator@6876 sub=Default] [VdbStatement] SQLError was thrown: "ODBC error: (XX002) - ERROR: index "pk_vpx_alarm_runtime" contains corrupted page at block 5877;
--> Error while executing the query" is returned when executing SQL statement "DELETE FROM VPX_ALARM_RUNTIME WHERE ENTITY_ID=?"
2018-09-25T21:17:09.355Z error vpxd[7F3FC958B800] [Originator@6876 sub=Daemon] Unhandled exception: Error[VdbODBCError] (-1) "ODBC error: (XX002) - ERROR: index "pk_vpx_alarm_runtime" contains corrupted page at block 5877;
--> Error while executing the query" is returned when executing SQL statement "DELETE FROM VPX_ALARM_RUNTIME WHERE ENTITY_ID=?"
2018-09-25T21:17:09.355Z info vpxd[7F3FC958B800] [Originator@6876 sub=SupportMgr] Wrote uptime information
(END)

Resolution:

Take a snapshot of the vcsa
Export affected table:

VCDB=# copy (select * from vpx_alarm_runtime) to '/tmp/vpx_alarm_runtime_select_.csv' with CSV DELIMITER ',';
COPY 3854
Drop affected constrain.
VCDB=# alter table vpx_alarm_runtime drop CONSTRAINT PK_VPX_ALARM_RUNTIME;

Recreate constrain

alter table vpx_alarm_runtime add constraint PK_VPX_ALARM_RUNTIME primary key (ENTITY_ID, ALARM_ID,
EXPRESSION_NAME)
recreate Table: VPX_ALARM_RUNTIME
/*==============================================================*/
/* Table: VPX_ALARM_RUNTIME */
/*==============================================================*/
create table VPX_ALARM_RUNTIME (
ENTITY_ID INTEGER not null,
ALARM_ID INTEGER not null,
ENTITY_TYPE INTEGER,
EXPRESSION_NAME VARCHAR(440) not null,
STATE_VALUE VARCHAR(255),
METRIC_VALUE INTEGER,
CREATED_TIME TIMESTAMP,
STATUS_VALUE VARCHAR(50),
EVENT_KEY INTEGER null,
constraint PK_VPX_ALARM_RUNTIME primary key (ENTITY_ID, ALARM_ID,
EXPRESSION_NAME)
);
 

Connecting to VMware appliance postgres/PSQL instance from an external computer/pgadmin

By default, the postgres instance on vCenter/vSphere replication..etc.. are configured to not accept connections from a computer on the network. On this  post, I will show you how to re-configure this to allow connections from an external box for tools like PGadmin etc.

Note: Depending on the appliance, the postgres, configuration files/paths might be different. On this post, we will search for the configuration and then change the respective file.

Start by ssh into the appliance.

Type the below command to search for the configuration file: postgresql.conf

find / -iname postgresql.conf

take a copy of the configuration.

cp /storage/db/vpostgres/postgresql.conf /storage/db/vpostgres/postgresql.conf.backup

Edit the file

vi /storage/db/vpostgres/postgresql.conf

Look for the line that says “listen_addresses = ‘XXXX”
In some cases, this will be hashed out, remove the hash. and replace local host with *

Save the configuration file (key combination: “Esc” + “:” and then type in “wq!”

Search for the Postgres configuration file

find / -iname pg_hba.conf

backup the configuration file

cp /storage/db/vpostgres/pg_hba.conf /storage/db/vpostgres/pg_hba.conf.bak

Edit the file

vi /storage/db/vpostgres/pg_hba.conf

Look for the below and replace this with the your IP subnet

host all all 192.168.1.0/24 trust   <———————————–From the below putty, you can see that I am on a 192.168.1.x subnet

The method is set to trust (not recommended) as I did not want to log into the DB with a password

Save the configuration file (key combination: “Esc” + “:” and then type in “wq!”

restart vmware-postgres service

service vmware-vpostgres restart

For vCenter server 6.5

service-control --vmware-vpostgres restart

Conform postgres port number and if it is listening to (vSPhere Replication appliance listens to a different port! it is best to know which port you need to connect to when accessing from an external box)

netstat -anop | grep postgres

From the above, we know the port is 5432

Launch pgadmin and add a new server

Also note that in most cases, the db credentials is stored in certain configuration files like

  • VCDB.properties
find / -iname vcdb.properties
Cat  /etc/vmware-vpx/vcdb.properties
  • or the .pgpass from the home directory
ls -ltha ~/

cat ~/.pgpass