Organizational Research By

Surprising Reserch Topic

cannot store euro sign into lob string property with hibernate postgresql


cannot store euro sign into lob string property with hibernate postgresql  using -'hibernate,postgresql,character-encoding,lob'

I am having trouble writing and reading back special characters like the Euro-sign (€) into LOB String properties in PostgreSQL 8.4 with Hibernate 3.6.10.

What I know is that PostgreSQL provides two distinct ways to store large character objects in a column of a table. They can be stored either directly into that table column or indirectly in a separate table (it's actually called pg_largeobject). In the latter case, the column holds a reference (OID) to the row in pg_largeobject.

The default behaviour in Hibernate 3.6.10 is the indirect OID approach. However, it is possible to add an extra annotation @org.hibernate.annotations.Type(type="org.hibernate.type.TextType") to the Lob property to get the direct storage behaviour.

Both approaches work fine, except for the moment that I want to work with special characters like the Euro sign (€). In that case the direct storage mechanism keeps working, but the indirect storage mechanism breaks.

I'd like to demonstrate that with an example. I created a test entity with 2 @Lob properties. One follows the direct storage principle, the other the indirect storage:

@Basic
@Lob
@Column(name = "CLOB_VALUE_INDIRECT_STORAGE", length = 2147483647)
public String getClobValueIndirectStorage()


and

@Basic
@Lob
@org.hibernate.annotations.Type(type="org.hibernate.type.TextType")
@Column(name = "CLOB_VALUE_DIRECT_STORAGE", length = 2147483647)
public String getClobValueDirectStorage()


If I create an entity, populate both properties with the Euro sign and then persist it towards the database I see the following when I do a SELECT I see

 id | clob_value_direct_storage | clob_value_indirect_storage
----+---------------------------+----------------------------
  6 | €                         | 910579                     


If I then query the table pg_largeobject I see:

  loid  | pageno | data
--------+--------+------
 910579 |      0 | \254


The 'data' column of pg_largeobject is of type bytea, which means that the information is stored as raw bytes. The expression '\254' represents one single byte and in UTF-8 represents the character '¬'. This is exactly the value that I get back when I load the entity back from the database.

The Euro sign in UTF-8 consists of 3 bytes, so I would have expected the 'data' column to have 3 bytes and not 1.

This does not only occur for the Euro sign, but for many special characters. Is this a problem in Hibernate? Or the JDBC driver? Is there a way I can tweak this behaviour?

Thanks in advance,
Kind regards,
Franck de Bruijn
    
asked Oct 13, 2015 by sachin wagh
0 votes
9 views



Related Hot Questions



Government Jobs Opening


...