Can anyone help me fix this? Thanks.

My environment is:

    OS => "Ubuntu 12.04.1 LTS 64-bit" on "VirtualBox 4.1.18"

    Software => Lily 1.3 + Solr 4.1.0 + CDH4

    Cluster => Single node

After importing a *.txt file using the Lily repository API (code modified from the cr/process/client project), I can get the blob file from the REST API and the content is correct. But in Solr, I found that many content blocks are lost, even for a small file (original file size: 9.4k; from the REST API: 9.4k; from a Solr query: 9.2k). As a result, the missing content can't be searched in Solr.
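To pin down exactly which blocks are lost, the text returned by the REST API can be compared with the text returned by the Solr query. The following is only a minimal sketch using Python's standard `difflib`; the function name `missing_blocks` and the sample strings are made up for illustration, and the two inputs are assumed to have been saved as plain strings beforehand.

```python
import difflib


def missing_blocks(original: str, indexed: str) -> list[str]:
    """Return the spans of `original` that are absent from `indexed`."""
    # autojunk=False keeps difflib from ignoring frequent characters
    # in longer inputs.
    matcher = difflib.SequenceMatcher(None, original, indexed, autojunk=False)
    missing = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # "delete" and "replace" both mean original[i1:i2] did not survive.
        if tag in ("delete", "replace"):
            missing.append(original[i1:i2])
    return missing


# Hypothetical sample: the middle block is dropped on the Solr side.
original = "header <abc lost text> footer"
indexed = "header  footer"
print(missing_blocks(original, indexed))
```

A character-level comparison like this is slow on large inputs, but it should be fine for a file of about 9.4k.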

Solr schema.xml:

                    maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>



Lily indexconfig.xml:

Data goes missing in blocks of this kind: ones that start with << followed by a character and end with >.
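To check whether every lost block really fits this pattern, the original text file can be scanned for such spans. This is a rough sketch only: the regular expression is my own guess at the pattern (a `<` immediately followed by a letter, `/`, `!` or `?`, running up to the next `>`), and `tag_like_spans` is a hypothetical helper name. A block that starts with `<<` is also caught, starting from its second `<`.

```python
import re

# Assumed pattern: "<" directly followed by a letter, "/", "!" or "?",
# then anything up to and including the next ">".
TAG_LIKE = re.compile(r"<[A-Za-z/!?][^>]*>")


def tag_like_spans(text: str) -> list[str]:
    """Return every span in `text` that matches the assumed pattern."""
    return [m.group(0) for m in TAG_LIKE.finditer(text)]


# Hypothetical sample: "<b and c>" matches, while "< 3" does not because
# the "<" is followed by a space.
sample = "if a <b and c> d then x < 3"
print(tag_like_spans(sample))
```

If the spans listed here are the same as the blocks missing from the Solr result, that would suggest the text is being treated as markup somewhere between the repository and the index.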

Is this a special pattern for Solr? How can I fix it?

Please help me find out why data goes missing in Solr. Thanks.

