"Can I predict how large my Zend Framework index will be? (and some quick q:s)"


Solr is basically a Java web application, typically run in a servlet container such as Apache Tomcat, that exposes a REST-style HTTP interface for querying an Apache Lucene index. Yes, you need to be able to run a Java application on your web server. This is an issue for you to work out with your hosting provider.

Clients using your web app don't need to run Java. Your PHP app can make a REST query to the Solr service and format the results as HTML. A client sees only the HTML output; it never needs to know that the data came from a service implemented in Java.
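The round trip described above can be sketched in a few lines. This is only an illustration, written in Python rather than PHP, and it makes several assumptions that are not from the original answer: the host and core name in `SOLR_BASE` are hypothetical, the documents are assumed to carry a single-valued `title` field, and a canned JSON string stands in for the response a running Solr service would return from its `/select` handler.

```python
import json
from html import escape
from urllib.parse import urlencode

# Hypothetical Solr location and core name; adjust for a real deployment.
SOLR_BASE = "http://localhost:8983/solr/posts"


def build_select_url(query, rows=10):
    """Build the URL for Solr's /select handler, asking for JSON output."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{SOLR_BASE}/select?{params}"


def render_results(response_text):
    """Turn a Solr JSON response into an HTML list; the client sees only this."""
    docs = json.loads(response_text)["response"]["docs"]
    # 'title' is an assumed field name; escape it so it is safe to embed in HTML.
    items = "".join(f"<li>{escape(str(doc.get('title', '')))}</li>" for doc in docs)
    return f"<ul>{items}</ul>"


# Canned response standing in for what the Solr service would send back.
sample = '{"response": {"numFound": 1, "docs": [{"id": "1", "title": "Lucene & PHP"}]}}'

print(build_select_url("title:lucene"))
print(render_results(sample))
```

A PHP app would do the same thing with its own HTTP and JSON functions. Either way, only the web server talks to the Java service, and the browser receives plain HTML.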

Zend_Search_Lucene is a pure-PHP implementation that is supposed to work identically to Apache Lucene. The Zend solution even uses the same index file format, so storage-wise the two should be equal.

I used Java Lucene to index the data dump (October 2009). I indexed 1.5 million rows, including about 1 GB of text data. The Lucene index was 1323 MB, whereas the MySQL FULLTEXT index of the same data was only 466 MB.

Using SQL LIKE predicates in lieu of any fulltext indexing solution requires no extra space, of course, because a pattern search such as LIKE '%word%' cannot make use of a conventional index anyway. But in my tests, LIKE was about 200 times slower than Java Lucene, which was in turn about 40% slower than a MySQL FULLTEXT index on the same data.

See my recent presentation about fulltext indexing solutions with MySQL:


It's not surprising that Zend_Search_Lucene can't match the performance and scalability of the Java Lucene technology. PHP's advantage as a language is development efficiency, not runtime efficiency.

Update: I just tried creating an index using Zend_Search_Lucene. Creating an index is far slower with PHP than with Java Lucene, so I indexed only 10,000 documents. This took almost 15 minutes, which means indexing the whole collection would take about 36 hours. Compare this to Java Lucene, which in my test indexed the full collection of 1.5 million documents in under 7 minutes.

The size of the index I created with Zend_Search_Lucene is 8.75 MB. Extrapolating this 150x, I estimate the full index would be 1312.5 MB. So I conclude that Zend_Search_Lucene creates an index of about the same size as the one produced by Java Lucene (1323 MB). This is as expected.
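The extrapolation above is simple linear scaling, and it can be reproduced as a quick check. The sketch below uses only the figures reported in this answer. The one assumption is that both index size and indexing time grow linearly with document count, which is what the 150x estimate already takes for granted.

```python
# Figures reported in the answer above.
SAMPLE_DOCS = 10_000
TOTAL_DOCS = 1_500_000
SAMPLE_INDEX_MB = 8.75
SAMPLE_MINUTES = 15  # "almost 15 minutes", so this is an upper bound
JAVA_LUCENE_INDEX_MB = 1323


def extrapolate(sample_value, sample_docs=SAMPLE_DOCS, total_docs=TOTAL_DOCS):
    """Scale a measurement taken on the sample linearly to the full collection."""
    return sample_value * total_docs / sample_docs


estimated_index_mb = extrapolate(SAMPLE_INDEX_MB)
estimated_hours = extrapolate(SAMPLE_MINUTES) / 60
size_difference = abs(estimated_index_mb - JAVA_LUCENE_INDEX_MB) / JAVA_LUCENE_INDEX_MB

print(f"estimated index size: {estimated_index_mb} MB")
print(f"estimated indexing time: at most {estimated_hours} hours")
print(f"difference from Java Lucene index: {size_difference:.1%}")
```

With a full 15 minutes for the sample, the time estimate comes to 37.5 hours; the figure of about 36 hours corresponds to a sample run of roughly 14.4 minutes. The estimated index size differs from the Java Lucene index by less than 1%.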

By Wolf5 on May 27 2022
