This question has been flagged
3 Replies
8094 Views

Hi - we have a crashed Openerp 7 db adn where trying to extract a pdf file from ir_attachment.dbdatas

I have tried directly via SQL but although I can retrieve something it appears to be "Gobbledygook!!

Is there a way of extracting files from the db?

copy 
(SELECT 
db_datas
FROM
ir_attachment
WHERE
name='Invoice_SAJ_2016_0964_.pdf')
to '/tmp/Invoice.pdf' (FORMAT "binary");

Any advice please.

If I open each file in a text editor each file starts with "PGCOPY" - a clue maybe?

Avatar
Discard

Hi Cameron,

Querying the database will not work because the pdf file is encoded by openerp. You have to use some python code to extract the pdf file:

Please fix the python code accordingly. I dunno how to paste and format a code block in this forum. :)

Formated Code: http://paste.ofcode.org/hecdAnzV2xKXnL238r8Scm

import psycopg2

import base64

def main():

# Try to connect

try:

conn=psycopg2.connect("dbname='yourdatabase' user='openerp' password='openerp'")

except:

print "unable to connect to database."

cur = conn.cursor()

try:

cur.execute("""select

a.name

, a.create_date

, regexp_replace(a.description , E'[\\n\\r]+', ' ', 'g' ) description

, a.datas_fname

, a.id

, a.db_datas --binary attachment encoded

from ir_attachment a

where a.datas_fname is not null

order by id

limit 10;

""")

except:

print "Error querying Postgresql"

rows = cur.fetchall()

try:

for row in rows:

print row[0] # name

print row[1] # description

print row[2] # datas_fname

print row[3]

#save pdf to disk

#invalid pdf. It's encoded in base64 by openerp

f = open('c:/Temp/file.pdf', 'wb')

f.write(row[5])

f.close()

#valid pdf.

f= open('c:/Temp/file_decode.pdf', 'wb')

#we have to decode first

f.write(base64.b64decode(row[5]))

f.close()

except ValueError:

print ValueError

if __name__ == '__main__':

main()

Greetings.

Author

Jamie - :) I cannot thank you enough. Worked a treat. Thank you, thank you, thank you :)

Author Best Answer

This is my reHASH of Sir 'Jaime Vasquez' above code, I tried converting his comment to an answer but kept failing.

Jaime, please forgive my poor rehash of your coding, I'm still in the "Jowels flapping in the wind, vertical acceleration learning phase" :)


import psycopg2

import base64

def main():

#Dataase details

host='192.168.0.100'

port= '5432'

dbname='CrashedDbName'

user='openerp'

password='LittlePigLetMeIn'

#Database connection

try:

conn = psycopg2.connect ("host='"+host+"' port= '"+port+"' dbname= '"+dbname+"' user= '"+user+"' password= '"+password+"'")

cur = conn.cursor()

print "Connected"

except:

print "Unable to connect to database."

return

try:

#Invoice range to extract

for Invoice in range (768,967):

SearchString = "SELECT a.name, description, a.create_date, a.datas_fname, a.id, a.db_datas FROM ir_attachment a WHERE a.datas_fname = '"+"INVSAJ20160"+str(Invoice)+".pdf.pdf"+"';"

cur.execute(SearchString)

rows = cur.fetchall()

#Save Invocies to file

for row in rows:

if row[5]:

#valid pdf

#Save path

f= open('C:\Users\Me\Documents\Invoices\INVSAJ20160'+str(Invoice)+'.pdf', 'wb')

#We have to decode first

f.write(base64.b64decode(row[5]))

f.close()

except:

print ValueError

return

if __name__ == '__main__':

main()

Avatar
Discard