Organizational Research By

Surprising Reserch Topic

cannot store euro sign into lob string property with hibernate postgresql

cannot store euro sign into lob string property with hibernate postgresql  using -'hibernate,postgresql,character-encoding,lob'

I am having trouble writing and reading back special characters like the Euro-sign (€) into LOB String properties in PostgreSQL 8.4 with Hibernate 3.6.10.

What I know is that PostgreSQL provides two distinct ways to store large character objects in a column of a table. They can be stored either directly into that table column or indirectly in a separate table (it's actually called pg_largeobject). In the latter case, the column holds a reference (OID) to the row in pg_largeobject.

The default behaviour in Hibernate 3.6.10 is the indirect OID approach. However, it is possible to add an extra annotation @org.hibernate.annotations.Type(type="org.hibernate.type.TextType") to the Lob property to get the direct storage behaviour.

Both approaches work fine, except for the moment that I want to work with special characters like the Euro sign (€). In that case the direct storage mechanism keeps working, but the indirect storage mechanism breaks.

I'd like to demonstrate that with an example. I created a test entity with 2 @Lob properties. One follows the direct storage principle, the other the indirect storage:

@Column(name = "CLOB_VALUE_INDIRECT_STORAGE", length = 2147483647)
public String getClobValueIndirectStorage()


@Column(name = "CLOB_VALUE_DIRECT_STORAGE", length = 2147483647)
public String getClobValueDirectStorage()

If I create an entity, populate both properties with the Euro sign and then persist it towards the database I see the following when I do a SELECT I see

 id | clob_value_direct_storage | clob_value_indirect_storage
  6 | €                         | 910579                     

If I then query the table pg_largeobject I see:

  loid  | pageno | data
 910579 |      0 | \254

The 'data' column of pg_largeobject is of type bytea, which means that the information is stored as raw bytes. The expression '\254' represents one single byte and in UTF-8 represents the character '¬'. This is exactly the value that I get back when I load the entity back from the database.

The Euro sign in UTF-8 consists of 3 bytes, so I would have expected the 'data' column to have 3 bytes and not 1.

This does not only occur for the Euro sign, but for many special characters. Is this a problem in Hibernate? Or the JDBC driver? Is there a way I can tweak this behaviour?

Thanks in advance,
Kind regards,
Franck de Bruijn

asked Oct 19, 2015 by sumit_jaiswalmca
0 votes

Related Hot Questions

1 Answer

0 votes
answered Oct 19, 2015 by rajeshujade