Monday, 22 February 2016

Apache Lucene - Index File Formats




    https://lucene.apache.org/core/5_0_0/core/org/apache/lucene/codecs/lucene50/package-summary.html

    Elasticsearch transaction log:
    http://javadoc.kyubu.de/elasticsearch/v2.1.0/org/elasticsearch/index/translog/Translog.html

    Wednesday, 10 February 2016

    Citations from Hello, Startup by Yevgeniy Brikman

    10x developers

    Studies have shown:

    • the ratio of initial coding time between the best and the worst programmers was about 20 to 1
    • the ratio of debugging times was over 25 to 1
    • the ratio of program size was 5 to 1
    • and the ratio of program execution speed was 10 to 1

    Citations from Hello, Startup by Yevgeniy Brikman

    Don't leave the interview without knowing:

    • What are the expectations for the role?
    • What does success look like for this job?
    • Who will my manager be?
    • What projects will I work on?
    • What is the tech stack?
    • What are the hours like? How many of them are spent coding? In meetings?
    • How do you build and release code?
    • What are the company's mission and values?
    • What is the office like?
    • What's your favorite and least favorite part of working here? 

    Thursday, 29 October 2015

    Elasticsearch 2.0 released

    Elasticsearch 2.0 has been released. It is a major milestone and an achievement of the whole team, with wonderful contributions from the community. Highlights include a new type of aggregation called pipeline aggregations, a simplified query DSL that merges the query and filter concepts, better compression options, hardened security through an enabled security manager, hardened file system behavior (fsync, more checksums, atomic renames), better performance, consistent mapping behavior, and much more. It also bundles the Lucene 5 release, which includes numerous improvements.
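
    As an illustration of the merged query and filter concepts, here is a minimal sketch of a search body in which a scoring clause and a non-scoring clause sit side by side in one bool query. The field names title and status, and the helper filtered_search, are hypothetical and chosen only for this example.

```python
import json


def filtered_search(text, status):
    """Build a search body that scores on `text` and filters on `status`.

    The `must` clause contributes to the relevance score; the `filter`
    clause only includes or excludes documents and does not affect scoring.
    Field names "title" and "status" are assumptions for illustration.
    """
    return {
        "query": {
            "bool": {
                "must": [{"match": {"title": text}}],
                "filter": [{"term": {"status": status}}],
            }
        }
    }


body = filtered_search("index file formats", "published")
print(json.dumps(body, indent=2))
```

    The resulting JSON can be sent as the body of a search request; how it is sent (HTTP client, official client library) is left out here on purpose.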

    Saturday, 26 September 2015

    tf–idf - best known weighting scheme in information retrieval

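    tf–idf, short for term frequency–inverse document frequency, is the best-known weighting scheme in information retrieval.
    A term's weight rises with how often it occurs in a document and falls with the number of documents that contain it.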
    Good explanation: https://class.coursera.org/nlp/lecture/187
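
    A minimal sketch of the idea, assuming one common variant: tf is the term count divided by the document length, and idf is the natural log of N / df. Other variants (log-scaled tf, smoothed idf) exist, and this is not how Lucene scores documents. The function name tf_idf and the sample documents are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df),
    where N is the number of documents and df is the number of
    documents containing the term. Tokenization is a plain
    lowercase whitespace split.
    """
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


docs = [
    "lucene stores an inverted index",
    "elasticsearch builds on lucene",
    "logstash ships logs to elasticsearch",
]
weights = tf_idf(docs)

# Print the terms of the first document, highest weight first.
for term, weight in sorted(weights[0].items(), key=lambda kv: -kv[1]):
    print(f"{term:10s} {weight:.3f}")
```

    In the first document, "lucene" gets a lower weight than "inverted" because it also appears in the second document. A term that appears in every document gets a weight of zero under this variant.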

    Friday, 8 May 2015

    I have just released Allegro OpenSource: Elasticsearch reindex tool
    http://allegrotech.io/elasticsearch-reindex-tool.html

    Friday, 6 March 2015

    Logstash roadmap

    Logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and store them for later use (for searching, for example). Speaking of searching, Logstash comes with a web interface for searching and drilling into all of your logs.
    It is fully free and fully open source. The license is Apache 2.0, meaning you are pretty much free to use it however you want.
    There is now also a Logstash roadmap.