Kafka is a distributed pub/sub broker that scales horizontally. Kafka brokers are designed to work in clusters, where you run multiple Kafka brokers on different
Hadoop has many standard compression codecs available, namely DEFLATE (.deflate), gzip (.gz), bzip2 (.bz2), LZO (.lzo), LZ4 (.lz4), and Snappy (.snappy). Only bzip2 is splittable by default (LZO files can be made splittable by indexing them), which is very important for
Avro supports both primitive and complex data types. Primitive data types: null, boolean, int, long, float, double, string, bytes. Complex data types: array – ordered collection of objects
The most common action type you will find in an Oozie workflow is the <map-reduce> action. In this blog we will see how to define a map-reduce action. The
Your reading schema doesn’t have to be the same as the writing schema. You can add new fields or remove existing fields (projection). If a new field
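
The idea of reading with a different schema than the one used for writing can be illustrated with a small sketch. This is plain Python, not the Avro library API: the function name `resolve_record` and the sample schemas are hypothetical, and the sketch assumes flat records only, ignoring type promotion, aliases, and nested types. It shows the two cases mentioned above: a field dropped by the reader (projection) and a field added by the reader, which is filled from its declared default.

```python
def resolve_record(record, writer_schema, reader_schema):
    """Project a record written with writer_schema onto reader_schema.

    Hypothetical helper for illustration only; schemas are dicts in the
    shape of Avro's JSON record schemas ({"fields": [{"name": ...}, ...]}).
    """
    writer_names = {field["name"] for field in writer_schema["fields"]}
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in writer_names:
            # Field exists in both schemas: take the written value.
            resolved[name] = record[name]
        elif "default" in field:
            # Field is new in the reader schema: fall back to its default.
            resolved[name] = field["default"]
        else:
            # A new reader field without a default cannot be resolved.
            raise ValueError(f"no value or default for field {name!r}")
    # Writer fields missing from the reader schema are simply dropped.
    return resolved


writer_schema = {"fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": "string"},
    {"name": "age", "type": "int"},
]}

reader_schema = {"fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": "string"},
    {"name": "email", "type": "string", "default": "unknown"},
]}

written = {"id": 1, "name": "alice", "age": 30}
result = resolve_record(written, writer_schema, reader_schema)
```

Here `age` is projected away because the reader does not ask for it, and `email` is populated from the reader's default because the writer never stored it.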