Developer Jul 21, 2022

Do you understand the Video API?

What is a video API?

A video API is an interface dedicated to providing audio and video transmission services. Video APIs fall into two main types: static video APIs and live video APIs.

Static Video API
Static Video API is an API that provides video file playback services. Service providers offer cloud storage and CDN distribution for video files and deliver them through protocol interfaces such as HTTP and RTMP.

For example, YouTube and Instagram use Static Video API.

Live Video API
Live Video API is an API that provides real-time audio and video transmission services.

For example, Live.me, Yalla, and Uplive use the Live Video API.

The static video API is easy to understand: it pulls video files from the cloud through a streaming protocol.

The live video API is more complicated. How does it ensure the video can be transmitted to the other end quickly, smoothly, and clearly?

In this article, we will introduce the logic behind the live video API in detail.

What Can The Video API Do?

As network bandwidth and device performance continue to improve, live video APIs are being used more and more widely. They make many scenarios possible, such as:

  1. Live streaming
  2. Online education
  3. Video conferencing
  4. Telemedicine
  5. Video calls
  6. Multiplayer

What Happens Behind The Video API?

A live video API must complete the end-to-end transmission of video data within 500 ms while also keeping the video picture clear and the call smooth.

Therefore, the live video API needs to move large volumes of audio and video data from end to end quickly and reliably. A complex system is required to ensure the availability of the live video API.

Video API composition

As shown in the figure, the live video API is mainly divided into six functional modules:

video API processing flow

1. Audio and video capture

Audio and video capture is the source of the audio and video data, which is collected from cameras, microphones, screens, video files, recording files, and other channels.

It involves choosing a color space for video, such as RGB or YUV, and setting audio parameters, such as sampling rate, number of channels, bit rate, and audio frame size.
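To make these capture parameters concrete, here is a minimal sketch that is not taken from any SDK. It converts one RGB pixel to YUV, assuming the full-range BT.601 coefficients used by JFIF, and computes the bit rate of uncompressed PCM audio. The class and method names are hypothetical and chosen only for illustration.

```java
/** Hypothetical helper illustrating capture parameters; not part of any SDK. */
final class CaptureMath {

    /** Converts an 8-bit RGB pixel to full-range YUV (assumed BT.601 / JFIF coefficients). */
    static int[] rgbToYuv(int r, int g, int b) {
        int y = (int) Math.round(0.299 * r + 0.587 * g + 0.114 * b);
        int u = (int) Math.round(-0.168736 * r - 0.331264 * g + 0.5 * b + 128);
        int v = (int) Math.round(0.5 * r - 0.418688 * g - 0.081312 * b + 128);
        return new int[] { clamp(y), clamp(u), clamp(v) };
    }

    /** Bit rate of uncompressed PCM audio, in bits per second. */
    static long pcmBitRate(int sampleRateHz, int channels, int bitsPerSample) {
        return (long) sampleRateHz * channels * bitsPerSample;
    }

    /** Keeps a component inside the valid 8-bit range. */
    private static int clamp(int value) {
        return Math.max(0, Math.min(255, value));
    }

    public static void main(String[] args) {
        int[] white = rgbToYuv(255, 255, 255);
        System.out.println(white[0] + ", " + white[1] + ", " + white[2]);
        System.out.println(pcmBitRate(48_000, 2, 16) + " bit/s");
    }
}
```

For 48 kHz, 16-bit stereo audio, pcmBitRate returns 1,536,000 bit/s, which gives a sense of why captured audio is compressed before it is sent.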

2. Audio and video preprocessing

Audio and video preprocessing is driven mainly by business scenarios: the collected data is processed further to optimize the customer experience.

For example:

  • Video data processing, such as beautification, filters, and special effects.
  • Audio data processing, such as voice changing, reverberation, echo cancellation, noise suppression, and volume gain.
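As an illustration of the simplest item in the list above, volume gain, here is a minimal sketch that assumes 16-bit PCM samples. The class name is hypothetical, and real preprocessing pipelines are considerably more involved.

```java
import java.util.Arrays;

/** Hypothetical example of one preprocessing step: volume gain on 16-bit PCM. */
final class VolumeGain {

    /** Returns a copy of the samples scaled by the gain factor, clamped to the 16-bit range. */
    static short[] apply(short[] samples, double gain) {
        short[] out = new short[samples.length];
        for (int i = 0; i < samples.length; i++) {
            long scaled = Math.round(samples[i] * gain);
            // Clamp instead of letting the value wrap around, which would be heard as harsh distortion.
            scaled = Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, scaled));
            out[i] = (short) scaled;
        }
        return out;
    }

    public static void main(String[] args) {
        short[] louder = apply(new short[] { 1000, -2000, 30000 }, 2.0);
        System.out.println(Arrays.toString(louder)); // prints [2000, -4000, 32767]
    }
}
```

The last sample shows the clamp at work: doubling 30000 would exceed the 16-bit maximum, so it is limited to 32767.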

3. Audio and video encoding

Audio and video encoding compresses the audio and video data so that it can be transmitted quickly and efficiently over the network.

Commonly used encoding formats are:

  • Video encoding formats: H.264, H.265
  • Audio encoding formats: Opus, AAC
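To see why encoding is needed at all, the following back-of-the-envelope sketch computes the bit rate of uncompressed video. It assumes YUV 4:2:0 frames (12 bits per pixel on average), and the 1.5 Mbit/s encoded bit rate is an assumed figure for illustration, not a number from any codec or SDK. The class name is hypothetical.

```java
/** Hypothetical back-of-the-envelope calculator; not part of any SDK. */
final class RawVideoBitRate {

    /** Bits per second of uncompressed YUV 4:2:0 video (12 bits per pixel on average). */
    static long yuv420BitsPerSecond(int width, int height, int framesPerSecond) {
        long bitsPerFrame = (long) width * height * 12;
        return bitsPerFrame * framesPerSecond;
    }

    /** How many times smaller the encoded stream is than the raw stream. */
    static double compressionRatio(long rawBitsPerSecond, long encodedBitsPerSecond) {
        return (double) rawBitsPerSecond / encodedBitsPerSecond;
    }

    public static void main(String[] args) {
        long raw = yuv420BitsPerSecond(1280, 720, 30);
        // 1_500_000 bit/s is an assumed target bit rate, used for illustration only.
        System.out.printf("raw=%d bit/s, ratio=%.1f%n", raw, compressionRatio(raw, 1_500_000));
    }
}
```

For 1280×720 at 30 frames per second, the raw stream is 331,776,000 bit/s, roughly 221 times the assumed 1.5 Mbit/s target.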

4. Audio and video transmission

Audio and video transmission is the most complex module in the video API. To ensure that the audio and video data can be transmitted to the opposite end quickly, stably, and with high quality in a complex network environment, streaming protocols such as RTP, HTTP-FLV, HLS, and RTMP can be used.

Along the way, problems such as packet loss, out-of-order delivery, and jitter must be handled with multiple algorithms, including data redundancy, flow control, adaptive frame rate and resolution, and jitter buffering.

So when deciding whether a video API is worth choosing, focus on whether the vendor has outstanding audio and video transmission capabilities.
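To give a feel for one of the transmission techniques mentioned above, here is a deliberately simplified jitter buffer that puts out-of-order packets back in sequence. It is a sketch under stated assumptions, not how any particular vendor implements it: the class name is hypothetical, payloads are plain strings, sequence numbers never wrap around, and a packet is given up as lost once more than maxPending later packets are waiting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical, simplified jitter buffer that reorders packets by sequence number. */
final class SimpleJitterBuffer {
    private final TreeMap<Integer, String> pending = new TreeMap<>();
    private final int maxPending;
    private int nextSeq;

    SimpleJitterBuffer(int firstSeq, int maxPending) {
        this.nextSeq = firstSeq;
        this.maxPending = maxPending;
    }

    /** Accepts a packet and returns every payload that is now ready to play, in order. */
    List<String> push(int seq, String payload) {
        List<String> ready = new ArrayList<>();
        if (seq < nextSeq) {
            return ready; // Too late or duplicate: already played or given up on.
        }
        pending.put(seq, payload);
        // If too many packets are waiting, treat the missing ones as lost and skip ahead.
        if (pending.size() > maxPending) {
            nextSeq = pending.firstKey();
        }
        // Release the contiguous run of packets starting at nextSeq.
        while (pending.containsKey(nextSeq)) {
            ready.add(pending.remove(nextSeq));
            nextSeq++;
        }
        return ready;
    }

    public static void main(String[] args) {
        SimpleJitterBuffer buffer = new SimpleJitterBuffer(0, 2);
        System.out.println(buffer.push(1, "b")); // held back until packet 0 arrives
        System.out.println(buffer.push(0, "a")); // releases both packets in order
    }
}
```

A production jitter buffer would also use timestamps to decide how long to wait, so that playback delay adapts to network conditions rather than to a fixed packet count.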

5. Audio and video decoding

Audio and video decoding means that once the audio and video data reaches the receiving end, the receiver decodes it according to the format in which it was encoded.

There are two video decoding methods: hardware decoding and software decoding. Software decoding generally uses the open source library FFmpeg. Audio decoding supports only software decoding; FDK-AAC or Opus is used, depending on the audio encoding format.

6. Audio and video rendering

Audio and video rendering is the last step in the video API processing flow. This step seems very simple: you only need to call the system interface to render the data to the screen.

However, a good deal of logic is needed to keep the video picture aligned with the audio. Fortunately, there are now standard processing methods for keeping audio and video synchronized.
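One common way to approach synchronization is to treat the audio playback position as the reference clock and to decide, for each video frame, whether to show it, hold it, or skip it. The sketch below illustrates that idea only; the class name, the three actions, and the 40 ms tolerance are assumptions made for illustration, not values from any SDK.

```java
/** Hypothetical sketch of audio-referenced synchronization; the threshold is an assumption. */
final class AvSync {
    enum Action { RENDER, WAIT, DROP }

    /** Maximum drift, in milliseconds, tolerated before correcting (assumed value). */
    static final long TOLERANCE_MS = 40;

    /** Decides what to do with a video frame given the current audio playback position. */
    static Action decide(long videoPtsMs, long audioClockMs) {
        long drift = videoPtsMs - audioClockMs;
        if (drift > TOLERANCE_MS) {
            return Action.WAIT;   // Frame is early: hold it until the audio catches up.
        }
        if (drift < -TOLERANCE_MS) {
            return Action.DROP;   // Frame is late: skip it to catch up with the audio.
        }
        return Action.RENDER;     // Close enough: show it now.
    }

    public static void main(String[] args) {
        System.out.println(decide(1000, 1000)); // prints RENDER
        System.out.println(decide(1100, 1000)); // prints WAIT
        System.out.println(decide(900, 1000));  // prints DROP
    }
}
```

Audio is usually chosen as the reference because listeners notice gaps or speed changes in sound more readily than a dropped or delayed video frame.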

How to use ZEGOCLOUD video API

Building a complete real-time audio and video system is complex work. Fortunately, many video APIs handle the complex underlying operations for us, so we only need to focus on the upper-level business logic.

The following sections introduce how to use the ZEGOCLOUD Video API to implement video calling.

1. Implementation process

The following diagram shows the basic process of User A playing a stream published by User B:

Video API call sequence

The following sections explain each step of this process in more detail.

2. Optional: Create the UI

Before creating a ZegoExpressEngine instance, we recommend you add the following UI elements to implement basic real-time audio and video features:

  • A view for local preview
  • A view for remote video
  • An End button

Video Call UI

3. Create a ZegoExpressEngine instance

To create a singleton instance of the ZegoExpressEngine class, call the createEngine method with the AppID of your project.

/** Define a ZegoExpressEngine object */
ZegoExpressEngine engine;

ZegoEngineProfile profile = new ZegoEngineProfile();
/** Replace appID with the AppID of your project, a long value in the form 123456789L */
profile.appID = appID;
/** General scenario */
profile.scenario = ZegoScenario.GENERAL;
/** Set application object of App */
profile.application = getApplication();
/** Create a ZegoExpressEngine instance */
engine = ZegoExpressEngine.createEngine(profile, null);

4. Log in to a room

To log in to a room, call the loginRoom method.

/** create a user */
ZegoUser user = new ZegoUser("user1");


ZegoRoomConfig roomConfig = new ZegoRoomConfig();
/** The token is generated by your own server. For easier debugging, you can get a temporary token from the ZEGOCLOUD Admin Console. */
roomConfig.token = "xxxx";
/** The onRoomUserUpdate callback is received only when the ZegoRoomConfig passed in has "isUserStatusNotify" set to "true". */
roomConfig.isUserStatusNotify = true;

/** log in to a room */
engine.loginRoom("room1", user, roomConfig, (int error, JSONObject extendedData)->{
    // (Optional callback) The result of logging in to the room. If you only pay attention to the login result, you can use this callback.
});  

Then, to listen for and handle various events that may happen after logging in to a room, you can implement the corresponding event callback methods of the event handler as needed.

engine.setEventHandler(new IZegoEventHandler() {

    /** Common event callbacks related to room users and streams. */

    /** Callback for updates on the current user's room connection status. */
    @Override
    public void onRoomStateUpdate(String roomID, ZegoRoomState state, int errorCode, JSONObject extendedData) {
        /** Implement the callback handling logic as needed. */
    }

    /** Callback for updates on the status of other users in the room. */
    @Override
    public void onRoomUserUpdate(String roomID, ZegoUpdateType updateType, ArrayList<ZegoUser> userList) {
        /** Implement the callback handling logic as needed. */
    }

    /** Callback for updates on the status of the streams in the room. */
    @Override
    public void onRoomStreamUpdate(String roomID, ZegoUpdateType updateType, ArrayList<ZegoStream> streamList, JSONObject extendedData){
        /** Implement the callback handling logic as needed. */
    }

});

5. Start the local video preview

To start the local video preview, call the startPreview method, passing the view for rendering the local video as the canvas parameter.

You can use a SurfaceView, TextureView, or SurfaceTexture to render the video.

/**
 *  Set up a view for the local video preview and start the preview with SDK's default view mode (AspectFill).
 *  The following preview_view is a SurfaceView, TextureView, or SurfaceTexture object on the UI.
 */
engine.startPreview(new ZegoCanvas(preview_view));

6. Publish streams

To start publishing a local audio or video stream to remote users, call the startPublishingStream method with the corresponding Stream ID passed to the streamID parameter.

/** Start publishing a stream */
engine.startPublishingStream("stream1");

Then, to listen for and handle various events that may happen after stream publishing starts, you can implement the corresponding event callback methods of the event handler as needed.

engine.setEventHandler(new IZegoEventHandler() {
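    /**
     * Common event callbacks related to stream publishing.
     * If an event handler is already registered, add these callbacks to it instead:
     * a new setEventHandler call is expected to replace the previous handler.
     */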

    /** Callback for updates on stream publishing status.   */
    @Override
    public void onPublisherStateUpdate(String streamID, ZegoPublisherState state, int errorCode, JSONObject extendedData){
        /** Implement the callback handling logic as needed. */
    }
});

7. Play streams

To start playing a remote audio or video stream, call the startPlayingStream method with the corresponding Stream ID passed to the streamID parameter and the view for rendering the video passed to the canvas parameter.

You can use a SurfaceView, TextureView, or SurfaceTexture to render the video.

/**
 *  Start playing a remote stream with the SDK's default view mode (AspectFill).
 *  The play_view below is a SurfaceView, TextureView, or SurfaceTexture object on the UI.
 */
engine.startPlayingStream("stream1", new ZegoCanvas(play_view));

8. Stop publishing and playing streams

To stop publishing a local audio or video stream to remote users, call the stopPublishingStream method.

/** Stop publishing a stream */
engine.stopPublishingStream();

If the local video preview has been started, call the stopPreview method to stop it.

/** Stop local video preview */
engine.stopPreview();

To stop playing a remote audio or video stream, call the stopPlayingStream method with the corresponding stream ID passed to the streamID parameter.

/** Stop playing a stream */
engine.stopPlayingStream("stream1");

9. Log out of a room

To log out of a room, call the logoutRoom method with the corresponding room ID passed to the roomID parameter.

/** Log out of a room */
engine.logoutRoom("room1");
