Protobuf and Avro

Use cases

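  • Serialized transfer of large file blocks in Hadoop
  • Serialized transfer in Spark's Netty-based shuffle mechanism
  • Serialized transfer of large file blocks in general, and RPC serialization with large payloads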

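What serialization is: converting an object or a file into a byte array, that is, a sequence of 0s and 1s such as [1,0,1,1,1,0], so that it can be sent over the network. Serialization tools differ in their algorithms and data structures, and therefore in speed and output size, so which one to use depends on the scenario.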
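To make the definition concrete, here is a minimal sketch using Java's built-in serialization (not Protobuf yet). The class SerializationDemo and its nested User type are hypothetical names made up for this illustration; only the java.io classes are real APIs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class, not part of the original article.
final class SerializationDemo {

    // A tiny value object used only for this illustration.
    static final class User implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int id;

        User(String name, int id) {
            this.name = name;
            this.id = id;
        }
    }

    // Object -> byte array, using Java's built-in serialization.
    static byte[] toBytes(Serializable value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    // Byte array -> object. Only a JVM that has the User class can do this.
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = toBytes(new User("Alice", 1234));
        System.out.println("serialized size in bytes: " + bytes.length);

        User copy = (User) fromBytes(bytes);
        System.out.println(copy.name + " / " + copy.id);
    }
}
```

The byte array is what would travel over the network; the receiving side rebuilds an equal object from it.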

Why use Protobuf and Avro

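  • Compared with JSON and Java native serialization, they serialize faster and produce smaller output. This suits big-data scenarios such as shuffle, RPC, and redundant copies for disaster recovery.
  • They are cross-language. Data written with Java native serialization cannot be deserialized by other languages.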
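The size claim can be checked on a small record. The sketch below hand-encodes a name and an id following the published Protobuf wire format (a varint key per field, then the value) and compares the result with the equivalent JSON text. It deliberately avoids the Protobuf library, so SizeComparison and its methods are hypothetical helpers written for this illustration, and the varint writer only handles non-negative values.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, written only to illustrate the size difference.
final class SizeComparison {

    // Writes a non-negative value as a base-128 varint (7 payload bits per byte).
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // high bit set: more bytes follow
            value >>>= 7;
        }
        out.write(value);
    }

    // Lays out a string in field 1 and a non-negative int32 in field 2,
    // following the Protobuf wire format.
    static byte[] encodePerson(String name, int id) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        writeVarint(out, (1 << 3) | 2);       // key: field 1, wire type 2 (length-delimited)
        writeVarint(out, nameBytes.length);   // length prefix
        out.write(nameBytes, 0, nameBytes.length);
        writeVarint(out, (2 << 3) | 0);       // key: field 2, wire type 0 (varint)
        writeVarint(out, id);
        return out.toByteArray();
    }

    // The same record as JSON text (no escaping; fine for this fixed sample).
    static byte[] encodeJson(String name, int id) {
        String json = "{\"name\":\"" + name + "\",\"id\":" + id + "}";
        return json.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println("protobuf wire bytes: " + encodePerson("Alice", 1234).length); // prints 10
        System.out.println("JSON bytes: " + encodeJson("Alice", 1234).length);            // prints 26
    }
}
```

The binary form is smaller mainly because field names are replaced by small numeric keys and the integer is stored in binary rather than as decimal text.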

Protobuf

It is just a tool, so using it is straightforward. The steps are:

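  1. Define a schema file ending in .proto. This file declares the format of the data being serialized; the deserializing side only needs a .proto file with the same content.
  2. Compile the .proto file into Java source code.
  3. Use the Protobuf Java SDK to transfer the data.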

Downloading the Protobuf compiler

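Download the release that matches your system from GitHub. For example, on 64-bit Windows I downloaded protoc-3.11.2-win64.zip, unpacked it, and added its bin directory to the PATH environment variable.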

pom.xml dependencies (proto2)

<!-- Core SDK -->
<dependency>
  <groupId>com.google.protobuf</groupId>
  <artifactId>protobuf-java</artifactId>
  <version>3.10.0</version>
</dependency>
<!-- JSON utilities -->
<dependency>
  <groupId>com.google.protobuf</groupId>
  <artifactId>protobuf-java-util</artifactId>
  <version>3.10.0</version>
</dependency>

Create addressbook.proto

To get .proto file support in IntelliJ IDEA, install the ProtoBuf Support plugin.

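// Declares proto syntax version 2; Hadoop 3.x uses proto2.
syntax = "proto2";
// Declares the package this .proto file belongs to, similar to a Java package.
package tutorial;
// Java package and outer class name of the generated code.
option java_package = "com.example.tutorial";
option java_outer_classname = "AddressBookProtos";
// Message definitions.
message Person {
  // required: must be set; building or parsing a message without it fails.
  // optional: may be left unset.
  // repeated: may appear any number of times, including zero.
  // The number after "=" is the field's unique tag in the binary encoding, not a
  // priority. Tags 1 to 15 take one byte to encode, so give them to the most
  // frequently used fields and use larger numbers for rarely used ones.
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;
  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }
  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }
  repeated PhoneNumber phones = 4;
}
message AddressBook {
  repeated Person people = 1;
}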
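Why small field numbers are worth reserving for common fields can be shown with a little arithmetic. On the wire, each field starts with a key equal to (field number << 3) | wire type, stored as a varint with 7 payload bits per byte. The sketch below only computes how many bytes that key needs; FieldTagSize is a hypothetical helper for this illustration, not part of the Protobuf SDK.

```java
// Hypothetical helper, written only to illustrate key sizes.
final class FieldTagSize {

    // Number of bytes the field key occupies when encoded as a varint.
    // The wire type lives in the low three bits and does not change the size.
    static int tagSize(int fieldNumber) {
        int key = fieldNumber << 3;
        int size = 1;
        int remaining = key >>> 7;   // each varint byte carries 7 bits of the key
        while (remaining != 0) {
            size++;
            remaining >>>= 7;
        }
        return size;
    }

    public static void main(String[] args) {
        int[] samples = {1, 15, 16, 2047, 2048};
        for (int fieldNumber : samples) {
            System.out.println("field " + fieldNumber + " -> " + tagSize(fieldNumber) + " byte(s)");
        }
        // prints 1, 1, 2, 2 and 3 byte(s) respectively
    }
}
```

In the Person message above, every field number is below 16, so each field costs a single key byte.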

Compiling to Java source files

protoc -I={directory containing your .proto files, e.g. E:\ideaproject\test\src\main\proto\} --java_out={Java output directory, e.g. E:\ideaproject\test\src\main\java}  {path to the .proto file, e.g. E:\ideaproject\test\src\main\proto\addressbook.proto}

Using the API

Writing data is simple: just call the generated API.

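import com.example.tutorial.AddressBookProtos;
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintStream;
class Writer {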
    // This function fills in a Person message based on user input.
    static AddressBookProtos.Person PromptForAddress(BufferedReader stdin,
                                                     PrintStream stdout) throws IOException {
        AddressBookProtos.Person.Builder person = AddressBookProtos.Person.newBuilder();
        stdout.print("Enter person ID: ");
        person.setId(Integer.valueOf(stdin.readLine()));
        stdout.print("Enter name: ");
        person.setName(stdin.readLine());
        stdout.print("Enter email address (blank for none): ");
        String email = stdin.readLine();
        if (email.length() > 0) {
            person.setEmail(email);
        }
        while (true) {
            stdout.print("Enter a phone number (or leave blank to finish): ");
            String number = stdin.readLine();
            if (number.length() == 0) {
                break;
            }
            AddressBookProtos.Person.PhoneNumber.Builder phoneNumber =
                    AddressBookProtos.Person.PhoneNumber.newBuilder().setNumber(number);
            stdout.print("Is this a mobile, home, or work phone? ");
            String type = stdin.readLine();
            if (type.equals("mobile")) {
                phoneNumber.setType(AddressBookProtos.Person.PhoneType.MOBILE);
            } else if (type.equals("home")) {
                phoneNumber.setType(AddressBookProtos.Person.PhoneType.HOME);
            } else if (type.equals("work")) {
                phoneNumber.setType(AddressBookProtos.Person.PhoneType.WORK);
            } else {
                stdout.println("Unknown phone type.  Using default.");
            }
            person.addPhones(phoneNumber);
        }
        return person.build();
    }
    // Main function:  Reads the entire address book from a file,
    //   adds one person based on user input, then writes it back out to the same
    //   file.
    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("Usage:  Writer ADDRESS_BOOK_FILE");
            System.exit(-1);
        }
        AddressBookProtos.AddressBook.Builder addressBook = AddressBookProtos.AddressBook.newBuilder();
        // Read the existing address book, closing the stream when done.
        try (FileInputStream input = new FileInputStream(args[0])) {
            addressBook.mergeFrom(input);
        } catch (FileNotFoundException e) {
            System.out.println(args[0] + ": File not found.  Creating a new file.");
        }
        // Add an address.
        addressBook.addPeople(
                PromptForAddress(new BufferedReader(new InputStreamReader(System.in)),
                        System.out));
        // Write the new address book back to disk, closing the stream even on failure.
        try (FileOutputStream output = new FileOutputStream(args[0])) {
            addressBook.build().writeTo(output);
        }
    }
}

Reading data is just as simple.

import com.example.tutorial.AddressBookProtos.AddressBook;
import com.example.tutorial.AddressBookProtos.Person;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.PrintStream;
class ListPeople {
  // Iterates through all people in the AddressBook and prints info about them.
  static void Print(AddressBook addressBook) {
    for (Person person: addressBook.getPeopleList()) {
      System.out.println("Person ID: " + person.getId());
      System.out.println("  Name: " + person.getName());
      if (person.hasEmail()) {
        System.out.println("  E-mail address: " + person.getEmail());
      }
      for (Person.PhoneNumber phoneNumber : person.getPhonesList()) {
        switch (phoneNumber.getType()) {
          case MOBILE:
            System.out.print("  Mobile phone #: ");
            break;
          case HOME:
            System.out.print("  Home phone #: ");
            break;
          case WORK:
            System.out.print("  Work phone #: ");
            break;
        }
        System.out.println(phoneNumber.getNumber());
      }
    }
  }
  // Main function:  Reads the entire address book from a file and prints all
  //   the information inside.
  public static void main(String[] args) throws Exception {
    if (args.length != 1) {
      System.err.println("Usage:  ListPeople ADDRESS_BOOK_FILE");
      System.exit(-1);
    }
    // Read the existing address book, closing the stream when done.
    try (FileInputStream input = new FileInputStream(args[0])) {
      Print(AddressBook.parseFrom(input));
    }
  }
}

Protobuf is easy to use; the only extra requirement is a .proto file that describes the schema of the data.

Avro

Avro works much the same way as Protobuf: define a schema file, then work with the data through the API.

https://juejin.im/post/5e2ac7256fb9a02fda4f4c2f
