Reading Avro DataFiles

Avro data files are portable across platforms: a data file written by a Java program can be read by a Python program. Each data file carries its schema with it. In this post we will see how to read a data file.

To read an Avro data file, all you need is the file name; because the schema is stored in the file, you do not have to supply one. We will use DataFileStream to iterate through the contents of the file and print them to the console. DataFileStream needs an InputStream for the target file and an instance of DatumReader.

Reading Avro DataFile From HDFS:

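import java.io.InputStream;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DatumReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// srcUri is the path of the data file; the default file system comes from the Hadoop configuration.
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
// No schema is passed to the reader: it uses the schema stored in the file.
DatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>();
// try-with-resources closes both streams even if reading fails.
try (InputStream is = fs.open(new Path(srcUri));
     DataFileStream<GenericRecord> dataFileStream =
         new DataFileStream<GenericRecord>(is, reader)) {
    GenericRecord record = null;
    while (dataFileStream.hasNext()) {
        record = dataFileStream.next(record); // reuse the record object between iterations
        System.out.println(record);
    }
}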
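
The statement that a data file carries its schema can be checked without the Avro library at all. The sketch below assumes the container layout described in the Avro specification: four magic bytes (O, b, j, 1) followed by a metadata map whose avro.schema entry holds the schema as JSON. The class AvroHeaderPeek and its methods are hypothetical helpers written for illustration; they are not part of Avro, and they decode every metadata value as UTF-8 text, which may not hold for custom binary metadata.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Avro): reads only the header of a data file.
class AvroHeaderPeek {

    // Reads one Avro "long": a variable-length, zig-zag encoded integer.
    static long readLong(DataInputStream in) throws IOException {
        long raw = 0;
        int shift = 0;
        int b;
        do {
            if (shift > 63) throw new IOException("malformed variable-length long");
            b = in.readUnsignedByte();            // throws EOFException on a truncated file
            raw |= (long) (b & 0x7F) << shift;    // low 7 bits carry data
            shift += 7;
        } while ((b & 0x80) != 0);                // high bit set means more bytes follow
        return (raw >>> 1) ^ -(raw & 1);          // undo the zig-zag encoding
    }

    // Reads a length-prefixed byte sequence (used here for both keys and values).
    static byte[] readBytes(DataInputStream in) throws IOException {
        long len = readLong(in);
        if (len < 0 || len > Integer.MAX_VALUE) throw new IOException("bad length " + len);
        byte[] buf = new byte[(int) len];
        in.readFully(buf);
        return buf;
    }

    // Returns the header metadata as strings, e.g. "avro.schema" -> schema JSON.
    static Map<String, String> readMetadata(InputStream stream) throws IOException {
        DataInputStream in = new DataInputStream(stream);
        byte[] magic = new byte[4];
        in.readFully(magic);
        if (magic[0] != 'O' || magic[1] != 'b' || magic[2] != 'j' || magic[3] != 1) {
            throw new IOException("Not an Avro data file");
        }
        Map<String, String> meta = new LinkedHashMap<>();
        // The map is a series of blocks; a block with count 0 ends it.
        for (long count = readLong(in); count != 0; count = readLong(in)) {
            if (count < 0) {      // negative count: the block's byte size follows
                count = -count;
                readLong(in);     // size is not needed here, so it is read and ignored
            }
            for (long i = 0; i < count; i++) {
                String key = new String(readBytes(in), StandardCharsets.UTF_8);
                String value = new String(readBytes(in), StandardCharsets.UTF_8);
                meta.put(key, value);
            }
        }
        return meta;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is assumed to be the path of a local Avro data file.
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println(readMetadata(in).get("avro.schema"));
        }
    }
}
```

In real code there is no need to parse the header by hand: DataFileStream reads it when the stream is opened, and dataFileStream.getSchema() returns the schema the file was written with.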

