Avro data files are portable across platforms: a data file written by a Java program can be read by a Python program. Each data file carries its schema with it. In this post we will see how to read a data file.
To read an Avro data file, all you need is the file name. We will use DataFileStream to iterate through the contents of the file and print them to the console. Its constructor takes an InputStream for the target file and an instance of DatumReader.
Reading an Avro Data File From HDFS:
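// srcUri holds the HDFS URI of the Avro data file
Configuration conf = new Configuration();
// Look up the FileSystem for that URI so the file can be opened
FileSystem fs = FileSystem.get(URI.create(srcUri), conf);
InputStream is = fs.open(new Path(srcUri));
DatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>();
DataFileStream<GenericRecord> dataFileStream =
    new DataFileStream<GenericRecord>(is, reader);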
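The snippet above stops once the stream is open. Here is a minimal sketch of the remaining step, iterating over the records and printing them. It assumes the Avro library is on the classpath. To stay self-contained it does not touch HDFS: it writes a small sample file into memory and reads that back. The class name AvroReadSketch, the helper sampleFile(), and the User schema are made up for illustration. With HDFS you would pass the InputStream returned by fs.open(...) to readAll instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DatumReader;

public class AvroReadSketch {

    // Reads every record from an Avro data file stream.
    // The first returned line names the embedded schema; the rest are the records.
    static List<String> readAll(InputStream in) throws IOException {
        DatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>();
        List<String> lines = new ArrayList<String>();
        // try-with-resources closes the DataFileStream and the stream it wraps
        try (DataFileStream<GenericRecord> stream =
                 new DataFileStream<GenericRecord>(in, reader)) {
            // No schema was supplied to the reader: it comes from the file header
            Schema schema = stream.getSchema();
            lines.add("schema: " + schema.getFullName());
            // DataFileStream is Iterable, so a for-each loop walks the records
            for (GenericRecord record : stream) {
                lines.add(record.toString());
            }
        }
        return lines;
    }

    // Hypothetical helper: builds a one-record Avro data file in memory
    static byte[] sampleFile() throws IOException {
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"User\","
            + "\"fields\":[{\"name\":\"name\",\"type\":\"string\"}]}");
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataFileWriter<GenericRecord> writer = new DataFileWriter<GenericRecord>(
                 new GenericDatumWriter<GenericRecord>(schema))) {
            writer.create(schema, bytes);
            GenericRecord user = new GenericData.Record(schema);
            user.put("name", "alice");
            writer.append(user);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Print the schema name followed by each record on the console
        for (String line : readAll(new ByteArrayInputStream(sampleFile()))) {
            System.out.println(line);
        }
    }
}
```

Note that readAll never receives a schema. GenericDatumReader is created empty and DataFileStream fills it in from the file header, which is what "Data Files carry the schema with them" means in practice.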