All you need to get started with GStreamer in Python

Sahil Chachra
7 min read · May 29, 2022


This blog helps you get started with GStreamer. It covers an introduction to GStreamer, the installation process (for Ubuntu-based distros), basic terms, sample pipelines, and what some of the components in those pipelines mean. It will also show you how to use GStreamer with OpenCV, along with some demos.

Contents

  1. Introduction
  2. Installation
  3. Basic Terms
  4. Example pipelines
  5. GStreamer & OpenCV
  6. References

Introduction

What is GStreamer?

GStreamer is an open source, cross platform, pipeline based framework for multimedia. GStreamer is used in media players, video/audio editors, web browsers, streaming servers, etc. It relies heavily on threads to process the data.

From official docs — “The GStreamer core function is to provide a framework for plugins, data flow and media type handling/negotiation. It also provides an API to write applications using the various plugins.”

GStreamer helps you build pipeline workflows that read a file in one format, process it (resize, rescale, add filters) and then export it in another format. Each component in GStreamer is plug and play. For example, in your pipeline you can clip, crop, transcode and merge audio and video from different sources using just GStreamer on the command line!

By using GStreamer in your computer vision pipeline, for example, you can convert your input stream from one format to another and resize and scale it with a single command before you pass it to the model! You can even send your inferred output frames onward in whatever format a downstream consumer requires, converting/encoding/resizing/rescaling on the go inside the GStreamer pipeline itself.

Architecture in brief

GStreamer Architecture (Source — Docs)

The GStreamer core framework is the heart of the design. It provides data transfer between elements, primitives for negotiating data types, communication channels for applications to talk to the pipeline, and media synchronisation.

The yellow boxes on the top row are actual applications. The blue box on the top layer is the GStreamer tools. The bottom layer contains the plugins.

Installation (Ubuntu/PopOs/Mint)

a. Make sure you have gcc and cmake installed.

b. Install Anaconda or set up a Python virtual environment

c. Follow this blog (the one I personally used) to install all the dependencies (such as codecs and other libraries) and build OpenCV with GStreamer support.

Note — A prebuilt OpenCV (e.g. installed via pip) won’t be able to leverage GStreamer even if GStreamer itself is already installed. We need to build OpenCV from source with GStreamer enabled to use it in our code.
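You can verify whether your OpenCV build actually has GStreamer support by inspecting the report returned by cv2.getBuildInformation(). A minimal sketch follows; the gstreamer_enabled helper is my own, not an OpenCV API, and the exact report line format ("GStreamer: YES (…)") is assumed from typical builds:

```python
# Check whether an OpenCV build report lists GStreamer support.
# cv2.getBuildInformation() returns a long text report; its Video I/O
# section usually contains a line like "GStreamer: YES (1.16.2)" or
# "GStreamer: NO".

def gstreamer_enabled(build_info: str) -> bool:
    """Return True if the build report lists GStreamer as enabled."""
    for line in build_info.splitlines():
        if "GStreamer" in line:
            return "YES" in line
    return False

# With OpenCV installed you would call it on the real report:
#   import cv2
#   print(gstreamer_enabled(cv2.getBuildInformation()))

# Illustration on sample report snippets:
print(gstreamer_enabled("GStreamer:                   YES (1.16.2)"))  # True
print(gstreamer_enabled("GStreamer:                   NO"))            # False
```

If this prints False for your build, `cv2.VideoCapture(..., cv2.CAP_GSTREAMER)` will fail to open pipelines, which is why the source build above matters.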

Basic Terms

  1. What is GstElement?

The GstElement object is the basic building block of a media pipeline. All the elements we treat as black boxes (decoders, encoders, demuxers) are derived from GstElement.

Elements (such as sinks, sources or filters) have properties, which are used to modify their behaviour. They also have signals, which let you trigger actions on the element or be notified of events.

To learn about the properties of an element, run gst-inspect-1.0 name_of_element. Example:

gst-inspect-1.0 autovideosink

2. What is a pipeline in GStreamer?

In simple terms, we take several GStreamer components, such as an input video source, a video decoder and an output sink, and chain them one after the other. That’s it! For example:

From docs — Display only the video portion of an MPEG-1 video file, outputting to an X display window:

gst-launch-1.0 filesrc location=videofile.mpg ! dvddemux ! mpeg2dec ! xvimagesink

Elements are separated by ‘!’, with a space before and after each ‘!’.
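The chaining rule above is easy to mirror in code. Here is a tiny sketch in Python that assembles a gst-launch-style pipeline description; the build_pipeline helper is hypothetical, just joining element descriptions with " ! ":

```python
# Hypothetical helper: compose a gst-launch-style pipeline description
# by joining element descriptions with " ! ".

def build_pipeline(*elements: str) -> str:
    return " ! ".join(elements)

pipeline = build_pipeline(
    "filesrc location=videofile.mpg",
    "dvddemux",
    "mpeg2dec",
    "xvimagesink",
)
print(pipeline)
# filesrc location=videofile.mpg ! dvddemux ! mpeg2dec ! xvimagesink
```

Building the string this way is handy later, when a pipeline description is passed to cv2.VideoCapture.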

3. What is source and sink?

Source — Source elements are those which can only generate data, such as a video file or an IP camera.

Sink — Sink elements are the end points of a pipeline: video playback on screen, soundcard playback or writing to disk, for example.

4. Filters

Filters and filter-like elements have both inputs and outputs. They receive data and send it on after some kind of processing.

5. What are source and sink pads?

Consider a filter which resizes video frames. It takes a video frame as input and gives a resized frame as output. The point where it takes in input is called the sink pad, and the point where it sends out processed data is called the source pad.

Source — Docs

In pipeline diagrams, the sink pad is always drawn on the left and the source pad on the right, for any GstElement.

6. State of the Elements

NULL — Deactivated element. No resources have been allocated to the element. It is denoted by GST_STATE_NULL and is the default state of an element.

READY — All the required resources have been allocated and the element is ready to process data. It is denoted by GST_STATE_READY. In this state the stream is not yet opened.

PAUSED — Denoted by GST_STATE_PAUSED. An element has a stream opened, but is not processing it actively. Quoting from docs — “Elements going into the PAUSED state should prepare themselves for moving over to the PLAYING state as soon as possible.”

PLAYING — In this state the element actively processes data. It is denoted by GST_STATE_PLAYING.

The GStreamer core handles the state changes of the elements automatically, stepping through the intermediate states as needed.
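The four states form an ordered ladder, and a state change walks through the intermediate rungs (e.g. NULL → READY → PAUSED → PLAYING). The following is an illustrative sketch of that ordering in plain Python, not the GStreamer API; State and transition_path are names I made up:

```python
# Sketch (not the GStreamer API): the four element states in order.
from enum import IntEnum

class State(IntEnum):
    NULL = 0
    READY = 1
    PAUSED = 2
    PLAYING = 3

def transition_path(current: State, target: State):
    """List the states visited when moving from current to target."""
    step = 1 if target > current else -1
    return [State(v) for v in range(current, target + step, step)]

print([s.name for s in transition_path(State.NULL, State.PLAYING)])
# ['NULL', 'READY', 'PAUSED', 'PLAYING']
```

In real code you only request the target state (e.g. set an element to PLAYING) and the core performs the intermediate transitions for you.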

Example pipelines

  1. The first pipeline is the “Hello world” of GStreamer

gst-launch-1.0 videotestsrc ! videoconvert ! autovideosink

gst-launch-1.0 = Build and launch a pipeline

videotestsrc = generates sample test video data

videoconvert = Converts video frames between multiple formats

autovideosink = automatically detects an appropriate video sink to use

2. Adding a capability to the pipeline

gst-launch-1.0 videotestsrc ! video/x-raw, format=BGR ! autovideoconvert ! ximagesink

video/x-raw, format=BGR is a capability (caps filter) on videotestsrc: it requests BGR frames from the source, which are then sent on to autovideoconvert.

autovideoconvert automatically converts the video into a format supported by the next element in the pipeline.

3. Setting width, height and framerate

gst-launch-1.0 videotestsrc ! video/x-raw, format=BGR ! autovideoconvert ! videoconvert ! video/x-raw, width=640, height=480, framerate=1/2 ! ximagesink

Here we change the height, width and framerate of the input video before sending it to the display (ximagesink). A framerate of 1/2 plays one frame every two seconds.
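Caps filters like the one above follow a simple "media-type, key=value, ..." syntax, so they are easy to generate programmatically. A small sketch (caps_filter is a hypothetical helper of mine, not a GStreamer function):

```python
# Hypothetical helper: format a caps filter string from keyword fields,
# matching the "video/x-raw, key=value, ..." syntax used above.

def caps_filter(media_type: str = "video/x-raw", **fields) -> str:
    parts = [media_type] + [f"{k}={v}" for k, v in fields.items()]
    return ", ".join(parts)

print(caps_filter(width=640, height=480, framerate="1/2"))
# video/x-raw, width=640, height=480, framerate=1/2
```

Note that framerate is passed as the string "1/2" because GStreamer expresses framerates as fractions.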

4. Using phone’s camera as IP cam

The command is:

gst-launch-1.0 rtspsrc location=rtsp://192.168.1.7:8080/h264_ulaw.sdp latency=10 ! queue ! rtph264depay ! h264parse ! avdec_h264 ! videoconvert ! videoscale ! video/x-raw,width=640,height=480 ! ximagesink

Let’s break down the elements and see what each of them means.

rtspsrc = The input source is RTSP. location is a property of the rtspsrc element; we set it to the path of the input source, in this case an IP address. latency is also a property. From the docs: “For pipelines with live sources, a latency is introduced, mostly because of the way a live source works”. Read more about latency in the docs.

queue = Creates a buffer so that input data from the source is stored and the next element can pick data from the queue and process it at its own pace. You can read more in the queue element’s documentation.

rtph264depay = Extracts H.264 video from RTP packets

h264parse = Parses the H.264 stream into a form that avdec_h264 can understand

avdec_h264 = Decodes h264 formatted data

videoconvert = Ensures compatibility between previous element and next element by converting frames from one format to other

videoscale = Resizes video. The following video/x-raw,width=640,height=480 is a capability on videoscale’s output; width and height are fields of that capability, not properties of the element.

ximagesink = Displays output on screen

NOTE: Properties are attached to an element by writing them after the element name, separated by a space, while capabilities are written like an element of their own, separated by ‘!’.
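The note above can be made concrete with a small sketch that formats an element together with its properties (the element helper is hypothetical, not part of GStreamer):

```python
# Hypothetical helper: render "element prop=value prop=value", per the
# note above: properties attach to their element with spaces, while
# capabilities stand between '!' separators like an element of their own.

def element(name: str, **props) -> str:
    return " ".join([name] + [f"{k}={v}" for k, v in props.items()])

print(element("rtspsrc", location="rtsp://192.168.1.7:8080/h264_ulaw.sdp", latency=10))
# rtspsrc location=rtsp://192.168.1.7:8080/h264_ulaw.sdp latency=10

pipeline = " ! ".join([
    element("rtspsrc", location="rtsp://192.168.1.7:8080/h264_ulaw.sdp", latency=10),
    "queue",
    "rtph264depay",
    "h264parse",
    "avdec_h264",
    "videoconvert",
    "videoscale",
    "video/x-raw,width=640,height=480",  # capability, not a property
    "ximagesink",
])
print(pipeline)
```

The caps filter is deliberately passed as its own "element" in the join, mirroring how it sits between ‘!’ separators on the command line.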

GStreamer & OpenCV

Here is sample code showing how to use a GStreamer pipeline as input to OpenCV.

import cv2

gstreamer_str = ("rtspsrc location=rtsp://192.168.1.5:8080/h264_ulaw.sdp latency=100 ! "
                 "queue ! rtph264depay ! h264parse ! avdec_h264 ! videoconvert ! "
                 "videoscale ! video/x-raw,width=640,height=480,format=BGR ! appsink drop=1")

cap = cv2.VideoCapture(gstreamer_str, cv2.CAP_GSTREAMER)

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    cv2.imshow("Input via Gstreamer", frame)
    if cv2.waitKey(25) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Note that the string passed to cv2.VideoCapture is the pipeline description only (no gst-launch-1.0 prefix, no sudo), and it must end in appsink so OpenCV can pull the frames.

References

  1. Installation Guide — Medium Article
  2. Understanding GStreamer for Absolute Beginners — YouTube
  3. Introduction To GStreamer — YouTube
  4. GStreamer documentation — Docs


Sahil Chachra

AI Engineer @ SparkCognition| Applied Deep Learning & Computer Vision | Nvidia Jetson AI Specialist