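Hadoop has many standard compression codecs available, namely DEFLATE (.deflate), gzip (.gz), bzip2 (.bz2), LZO (.lzo), LZ4 (.lz4), and Snappy (.snappy). Of these, only bzip2 is splittable out of the box (LZO becomes splittable only once the files are indexed). Splittability is very important for ...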
Avro supports both primitive and complex data types. Primitive data types: null, boolean, int, long, float, double, string, bytes. Complex data types: array – ordered collection of objects ...
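To make the type list concrete, here is a minimal sketch of an Avro record schema written as a Python dict and serialized to JSON (Avro schemas are defined in JSON). The record name `User` and all of its field names are hypothetical, chosen only for illustration, and `field_kinds` is a small helper invented for this example, not part of any Avro library.

```python
import json

# Hypothetical record schema; "User" and its fields are illustrative only.
user_schema = {
    "type": "record",
    "name": "User",
    "fields": [
        # Primitive types are referenced by name.
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"},
        {"name": "active", "type": "boolean"},
        {"name": "score", "type": "double"},
        {"name": "avatar", "type": "bytes"},
        # Complex type: an array is an ordered collection of items.
        {"name": "tags", "type": {"type": "array", "items": "string"}},
    ],
}

PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "string", "bytes"}


def field_kinds(schema):
    """Map each field name to 'primitive' or to the name of its complex type."""
    kinds = {}
    for field in schema["fields"]:
        field_type = field["type"]
        if isinstance(field_type, str) and field_type in PRIMITIVES:
            kinds[field["name"]] = "primitive"
        elif isinstance(field_type, dict):
            kinds[field["name"]] = field_type["type"]
        else:
            kinds[field["name"]] = "other"
    return kinds


# The JSON text is what would typically be saved in a schema (.avsc) file.
schema_json = json.dumps(user_schema, indent=2)
kinds = field_kinds(user_schema)
print(schema_json)
print(kinds)
```

The helper only inspects the top-level fields, so it is a way to read the schema, not a validator.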
The most common action type you will find in an Oozie workflow is the <map-reduce> action. In this blog we will see how to define a map-reduce action. The ...
Your reading schema doesn't have to be the same as the writing schema. You can add new fields or remove existing fields (projection). If a new field ...
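The idea of reading with a different schema can be sketched in a few lines of plain Python. This is a toy illustration for flat records only, not the Avro library: `resolve`, the sample record, and the field names are all hypothetical. It assumes the behaviour Avro defines for schema resolution, where writer fields missing from the reader schema are ignored and reader fields missing from the data take their declared default.

```python
def resolve(record, reader_schema):
    """Toy reader-side resolution for a flat record (not the Avro library)."""
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            # Field exists in the written data: keep its value.
            resolved[name] = record[name]
        elif "default" in field:
            # New field in the reader schema: fall back to its default.
            resolved[name] = field["default"]
        else:
            raise ValueError(f"no value or default for reader field {name!r}")
    # Fields present in the record but absent from the reader schema are dropped.
    return resolved


# Record as it was written with the (hypothetical) writer schema.
written = {"id": 1, "name": "Ann", "email": "ann@example.com"}

# Reader schema drops "email" (projection) and adds "country" with a default.
reader_schema = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"},
        {"name": "country", "type": "string", "default": "unknown"},
    ],
}

result = resolve(written, reader_schema)
print(result)
```

A real Avro reader also handles type promotion, unions, and nested records, all of which this sketch leaves out.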
Avro data files are portable across platforms. You can read data files written by a Java program from a Python program. Data files carry the schema with them. In ...