Sunday, March 2, 2014

HBase Interview Questions

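    1. What is the similarity between HBase and RDBMS?
      Both store data in tables made up of rows and columns. Unlike an RDBMS, HBase guarantees atomicity only at the row level and does not provide multi-row transactions out of the box.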
    2. What is HBase's design based upon?
      HBase's design is based upon Google's Bigtable.
    3. What is a distinctive feature supported by HBase?
      HBase supports versioning of cell values out of the box.
    4. How is versioning implemented in HBase?
      Versioning is implemented using the timestamp stored with each cell value. A read returns the newest version unless an older one is requested.
    5. What are the different types of compression algorithms supported by HBase?
      Gzip, Lempel-Ziv-Oberhumer (LZO), and Snappy, among others.
    6. Which compression algorithm comes packaged with HBase?
      Gzip
    7. Why is LZO not packaged with HBase? How can it be included?
      LZO is GPL-licensed, which is incompatible with HBase's Apache license, so it is not packaged with HBase. It can be downloaded and installed separately.
    8. What is a region?
      A region is a contiguous range of rows, identified by a start key (inclusive) and an end key (exclusive).
    9. How are rows kept in HBase?
      Rows are kept sorted by row key.
    10. How are region-to-region-server assignments managed?
      The HBase Master assigns regions to region servers and relies on ZooKeeper (a distributed coordination service) to coordinate and track those assignments.
    11. What are the two special tables in HBase?
      -ROOT- and .META. (HBase 0.96 and later drop -ROOT- and rename .META. to hbase:meta.)
    12. What does the .META. table store?
      It keeps track of the regions of all user tables and which region servers are responsible for serving them.
    13. Does one table map to one region?
      No. As the table grows, its regions split, and the resulting regions are spread across the cluster.
    14. How are write operations performed in HBase?
      Each write is first appended to the WAL (Write-Ahead Log) and then placed in the in-memory MemStore. The MemStore is later flushed to disk as an HFile.
    15. Is writing to WAL mandatory?
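      No, writing to the WAL is not mandatory. It is enabled by default.
    16. How can you control the WAL setting?
      Writing to the WAL can be turned off per mutation (for example on a Put) using the setWriteToWAL() method. Newer releases replace it with setDurability().
    17. What is the advantage of disabling WAL?
      It improves write performance, at the risk of losing data if a region server fails before the MemStore is flushed.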
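    18. What do Bloom filters help determine?
      Bloom filters help determine whether a store file might contain a given row key, or a given row key and column combination. A negative answer is definite, so reads can skip files that cannot contain the data.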
    19. Why are operations that alter column family characteristics expensive?
      HBase creates a new column family with the new specification, copies all the data over from the old column family, and then deletes the old one.
    20. What are the three different running modes supported by HBase?
      Standalone mode, pseudo-distributed mode, and fully distributed mode.
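
The answers on versioning, regions, and sorted row keys (questions 4, 8, and 9) can be illustrated with a small toy model. This is only a sketch in plain Python and does not use the HBase API. The split points, region names, the `find_region` helper, and the `VersionedCell` class with its `max_versions` value are all made up for illustration.

```python
import bisect

# Toy model: regions are described by their sorted start keys. A region's end
# key is the next region's start key (exclusive). "" means "start of table".
REGION_START_KEYS = ["", "g", "p"]                    # hypothetical split points
REGION_NAMES = ["region-1", "region-2", "region-3"]   # hypothetical names


def find_region(row_key):
    """Return the region whose [start key, end key) range contains row_key."""
    # bisect_right counts the start keys <= row_key; the last of those
    # start keys belongs to the region that owns the row.
    index = bisect.bisect_right(REGION_START_KEYS, row_key) - 1
    return REGION_NAMES[index]


class VersionedCell:
    """Toy cell that keeps several values, each keyed by a timestamp."""

    def __init__(self, max_versions=3):  # arbitrary value for this sketch
        self.max_versions = max_versions
        self.versions = {}  # timestamp -> value

    def put(self, timestamp, value):
        self.versions[timestamp] = value
        # Keep only the newest max_versions timestamps.
        for old in sorted(self.versions, reverse=True)[self.max_versions:]:
            del self.versions[old]

    def get(self, timestamp=None):
        """Return the newest value, or the newest one at or before timestamp."""
        candidates = [t for t in self.versions
                      if timestamp is None or t <= timestamp]
        return self.versions[max(candidates)] if candidates else None
```

In this model a row key equal to a split point, such as "g", falls into the region that starts at that key, which matches the inclusive start key and exclusive end key described in question 8.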
