Saving Your Analog Data from Oblivion

If you have old VHS tapes or audio cassettes lying around, the hardware to play these analog formats is becoming more difficult to find. Here's how to convert those old analog treasures to digital format for future enjoyment.

Transferring VHS tapes, audio cassettes, and other analog home media formats to a digital format, such as Ogg or Matroska, can be a complex and expensive process with archival-grade conversions. In this article, I show you a simple and inexpensive method for digitizing your VHS tapes that is perfect for personal use.

Shopping List

To convert a VHS tape to a digital format, you need two pieces of equipment: a playback device for the original medium and a capture device to read the playback device's output. In addition, you need transcoding software to process the data that the capture device retrieves from the playback device. For a list of what I used in this project, see the "Software and Hardware Requirements" box.

Software and Hardware Requirements

Hardware:

  • Firstline VCR-602
  • USBTV007 EasyCAP
  • SCART to RCA adaptor

Software:

  • GStreamer 1.6.4
  • FFmpeg 3.2.4
  • SoX 14.4.2

For the playback device, you can find a used VCR for less than EUR100 online ($100-$200+ for NTSC/PAL/multisystem). Keep in mind that your chosen playback device must match the color encoding system of the VHS tapes you intend to transfer. Trying to play a PAL VHS on an NTSC device won't work (see the box "PAL vs. NTSC").

PAL vs. NTSC

PAL and NTSC are two color encoding systems – originally intended for analog television – that also apply to VHS tapes, players, and recorders:

  • Phase Alternating Line (PAL) is used in most of Europe, Australia, the Middle East, Asia, and a big part of South America [1]. PAL media run at 25 frames per second (fps) and have a frame size of 720x576. In this article, the example commands are for PAL media.
  • National Television System Committee (NTSC) is used in North America, Japan, parts of South America, and a few other countries [2]. NTSC media run at 29.97fps and have a frame size of 720x480. If you try to reproduce the steps in this article with NTSC devices, you will need to replace these values in the examples.
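
The substitution described above can be kept in one place with a small helper. This is only a sketch: the function name capture_caps is made up for illustration, the NTSC frame rate is written as the fraction 30000/1001 (29.97fps), and the pixel format is the YUY2 format used in this article's examples.

```shell
# Print the GStreamer caps string for a color system.
# Frame sizes and rates come from the "PAL vs. NTSC" box;
# capture_caps is a hypothetical helper, not part of any tool.
capture_caps() {
    case "$1" in
        PAL)  echo "video/x-raw,format=YUY2,framerate=25/1,width=720,height=576" ;;
        NTSC) echo "video/x-raw,format=YUY2,framerate=30000/1001,width=720,height=480" ;;
        *)    echo "unknown color system: $1" >&2; return 1 ;;
    esac
}
```

For an NTSC capture, the caps line in the capture examples would become `! $(capture_caps NTSC)`. The norm="PAL" setting would presumably need to change to match as well; check the values your capture driver accepts.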

For a capture device, I use a USBTV007 EasyCAP (Figure 1), which is inexpensive (less than EUR15, although it may be difficult to find in the US), performs captures of acceptable quality, and has Linux support. Keep in mind that "EasyCAP" is not a commercial brand name; it is a popular term used by Chinese manufacturers to designate cheap, simple USB capture cards. Finding a specific EasyCAP model can be challenging, since manufacturers usually don't include a full specification list or typical use cases in their product descriptions. See the box "The Other EasyCAP."

The Other EasyCAP

Many EasyCAP devices other than the USBTV007 work under Linux. For more information about these other devices, see the LinuxTV wiki [3], but keep in mind that it is a bit outdated. (In fact, the wiki is so outdated that the EasyCAP model used here is reported to be unable to capture sound at the time of writing.)

Some of these EasyCAPs might require different command-line switches than the ones provided in this article's examples. If you have trouble getting your EasyCAP to work, try using a different pixel format option than YUY2 with GStreamer, such as UYVY or YV12 (Table 1). If you use any of these, you might want to supply a different pix_fmt switch to FFmpeg, although this is not strictly necessary.

Table 1

Pixel Formats

GStreamer    FFmpeg
YUY2         yuyv422
UYVY         uyvy422
YV12         None known

Figure 1: This EasyCAP device works as a cost-effective capture device.

You will also need a SCART to RCA adaptor (Figure 2), which may be purchased for less than EUR10 ($8). The SCART connector plugs into the VCR output, and the RCA plugs go into the EasyCAP connectors. Most RCA connectors follow the standard color coding (Table 2).

Table 2

RCA Color Codes

Color     Connection
White     Left audio
Red       Right audio
Yellow    Video

Figure 2: This SCART to RCA adaptor has an extra pin for audio input, which I won't be using.

In addition, I used a laptop with an i5-2467M CPU for my tests, but you can use a computer with less horsepower. A dual-core CPU running at 2.5GHz per core is the minimum requirement. However, tests showed that, with a CPU this weak, the procedures described in this article result in a barely acceptable loss of frames during encoding. Using a drive with a high write speed is also advisable, since the VHS will be transferred and saved to the filesystem in real time.

In terms of software, I used GStreamer with the Ugly plugin for the capture (see the "Why GStreamer?" box) and FFmpeg to transcode. FFmpeg requires x264 and libfdk_aac support. Finally, the SoX sound processing utility is used to clean up the transferred media's audio.
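
Before connecting any hardware, it may be worth checking that the tools are actually installed. The following sketch assumes the command names used later in this article; the helper names missing_tools and has_encoder are made up for illustration.

```shell
# Print the name of every required command that is not on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Succeed if the local FFmpeg build lists the given encoder.
# Relies on the encoder name being surrounded by spaces in the listing.
has_encoder() {
    ffmpeg -hide_banner -encoders 2>/dev/null | grep -q " $1 "
}
```

A typical check might be `missing_tools gst-launch-1.0 ffmpeg sox arecord v4l2-ctl mkvpropedit`, followed by `has_encoder libx264 || echo "no libx264"` and the same test for libfdk_aac.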

Why GStreamer?

Many tutorials suggest capturing your VCR or cassette player's output directly with FFmpeg or MPlayer. Instead, I chose GStreamer to capture the VCR output and pipe it into FFmpeg, because GStreamer has better error tolerance when dealing with faulty media feeds. This means GStreamer will work better with cheap capture devices that provide bad frame rates (e.g., 25.02fps instead of 25fps for PAL video). This is especially true when muxing the output to formats that support time stamps, such as Matroska [4].

Hardware Detection

First, power up your computer and connect all the hardware as previously described. Then, make sure your operating system properly identifies the capture device.

The USBTV007 EasyCAP will show up as two different capture devices: one for audio and one for video. This model does not work with the PulseAudio sound server, so you might need to tell PulseAudio to ignore the device so that ALSA can manage it instead. To do this, use the graphical tool pavucontrol: Move to the Configuration tab and select the Off profile for USBTV007. If you prefer to work from the command line, then enter,

pactl set-card-profile $card_number off

where $card_number is the identifier of your EasyCAP device for PulseAudio. If you don't know what it is, you can find out by typing:

pactl list cards | grep -E 'device.product.name|device.string'

To list your currently detected audio inputs, enter the command:

arecord -l

This will display the working capture devices detected by ALSA.
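
The card and device numbers in that listing are what the capture commands need in the hw:card,device form. A minimal sketch of the conversion follows; it assumes the English-language output layout of arecord -l ("card N: ..., device M: ..."), and the function name alsa_hw_names is hypothetical.

```shell
# Read `arecord -l` output on stdin and print one line per capture
# device in the form "hw:<card>,<device>  <card description>".
# Lines that do not describe a device (headers, subdevices) are skipped.
alsa_hw_names() {
    sed -n 's/^card \([0-9]*\): \(.*\), device \([0-9]*\): .*$/hw:\1,\3  \2/p'
}
```

Run it as `LC_ALL=C arecord -l | alsa_hw_names` and pick the line that belongs to the EasyCAP. The description shown for the device depends on the driver.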

Video inputs will show up as files named /dev/video* (e.g., /dev/video0, /dev/video1, etc.). If you have more than one video input – for example, you have a webcam in addition to an EasyCAP – and you are not sure which one is which, you can extract information from each device with the v4l2-ctl command. The following command will display some known properties of /dev/video0:

v4l2-ctl --device=/dev/video0 --list-inputs

The Capture Process

Once everything is connected and detected, it is time to capture the video. To do this, insert a VHS tape into the VCR and play it, and then turn on your capture software, which takes the VCR's output and dumps it into a file on the fly.

The raw video and audio stream can take up a lot of storage, probably more than 100GB, which is why most people prefer encoding that stream into something more manageable as it is captured. Because the computer is capturing the raw VCR output in real time, your encoder must be both fast and CPU-friendly, so as not to affect the data capture negatively. If you try to use an encoding configuration that is too hard on the CPU, it will not have enough time to process each piece of incoming data before the next one arrives, which will result in lost frames and data loss.
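
The storage figure can be checked with a little arithmetic. The sketch below assumes the capture settings used in this article: a 720x576 PAL picture at 25fps in YUY2 (2 bytes per pixel), plus 16-bit stereo audio at 48kHz. The function name is made up for illustration.

```shell
# Bytes needed for an uncompressed capture of the given length in seconds.
#   video: 720 * 576 pixels * 2 bytes * 25 frames per second
#   audio: 48000 samples * 2 bytes * 2 channels per second
raw_capture_bytes() {
    echo $(( (720 * 576 * 2 * 25 + 48000 * 2 * 2) * $1 ))
}
```

For a 100-minute tape, `raw_capture_bytes 6000` prints 125568000000, or roughly 125GB, nearly all of it video.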

Listing 1 performs a lossless capture of the analog input. While not very practical, Listing 1 is provided as a reference. Notice that the buffers are set to zero in order to avoid problems during the real-time video capture. The pixel format, YUY2, is set to the format that the EasyCAP feeds to the computer. Audio is captured in a lossless format at a sampling rate of 48kHz in stereo. The output is dumped to a Matroska file. Listing 1 is tuned for PAL devices (if you are using NTSC, see the "PAL vs. NTSC" box).

Listing 1

Compressionless Capture Command

gst-launch-1.0 -q v4l2src device="$videodevice" do-timestamp=true norm="PAL" pixel-aspect-ratio=1 \
    ! video/x-raw,format=YUY2,framerate=25/1,width=720,height=576 \
    ! queue max-size-buffers=0 max-size-time=0 max-size-bytes=0 \
    ! mux. \
  alsasrc device="$alsadevice" do-timestamp=true \
    ! audio/x-raw,format=S16LE,rate=48000,channels=2 \
    ! queue max-size-buffers=0 max-size-time=0 max-size-bytes=0 \
    ! mux. \
  matroskamux name=mux \
    ! queue max-size-buffers=0 max-size-time=0 max-size-bytes=0 \
    ! filesink location=vhs.mkv

In Listing 1, replace "$videodevice" with a video input device (e.g., /dev/video1). The "$alsadevice" (in the alsasrc device line) describes an audio input such as hw:1,0 or hw:2,0. The ALSA audio inputs are always named in a "hw:$card,$device" format.

Listing 2 performs lossy compression while capturing the analog stream, resulting in a smaller file. It is configured to capture video for 1 hour and 40 minutes, but you can stop the recording by pressing q at any time.

Listing 2

Compressed Capture Command

ffmpeg -loglevel 32 -t 01:40:00 \
       -i <( gst-launch-1.0 -q v4l2src device="$videodevice" \
             do-timestamp=true norm="PAL" pixel-aspect-ratio=1 \
             ! video/x-raw,format=YUY2,framerate=25/1,width=720,height=576 \
             ! queue max-size-buffers=0 max-size-time=0 max-size-bytes=0 \
             ! mux. \
            alsasrc device="$alsadevice" do-timestamp=true \
             ! audio/x-raw,format=S16LE,rate=48000,channels=2 \
             ! queue max-size-buffers=0 max-size-time=0 max-size-bytes=0 \
             ! mux. \
            matroskamux name=mux \
             ! queue max-size-buffers=0 max-size-time=0 max-size-bytes=0 \
             ! fdsink fd=1 \
           ) \
       -c:v libx264 -preset ultrafast -x264opts crf=18:keyint=50:min-keyint=5 \
       -pix_fmt yuyv422 -c:a flac \
       -f matroska file:vhs.mkv

Listing 2 offers a good trade-off. GStreamer captures the VHS output and pipes it to FFmpeg for transcoding (see Table 3). It will encode the raw stream without a noticeable loss of quality and will diminish the output file size by a factor of ten. Because it is CPU-friendly, it allows the capture of the analog data without frame loss. The resulting file is, however, still too big for most users. It is also likely to suffer from audio and video noise, since old analog data is susceptible to acquiring defects because of the media's age. Therefore, the captured video will require postprocessing.
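
One rough way to check for frame loss is to compare the number of frames in the captured file with the number the recording length implies. This is a sketch rather than a definitive test: the function names are hypothetical, the frame rate is passed as a fraction (25 1 for PAL, 30000 1001 for NTSC), and the count is only a meaningful indicator if the encoder did not duplicate frames to fill gaps.

```shell
# Frames expected for a recording of <seconds> at <fps_num>/<fps_den>.
expected_frames() {
    echo $(( $1 * $2 / $3 ))
}

# Frames actually present in the first video stream of a file.
# Decodes the whole file, so it takes a while on long captures.
count_frames() {
    ffprobe -v error -select_streams v:0 -count_frames \
            -show_entries stream=nb_read_frames \
            -of default=nokey=1:noprint_wrappers=1 "$1"
}
```

For a full 100-minute PAL capture, `expected_frames 6000 25 1` gives 150000. A `count_frames vhs.mkv` result well below that suggests the CPU or disk could not keep up.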

Table 3

FFmpeg Options

Option               Meaning
-t 01:40:00          Sets a duration for the recording.
-c:v libx264         Uses the x264 video codec.
-preset ultrafast    Makes the encoder work as fast as possible.
crf=18               Sets the quality of the encoding. Lower is better, but also implies larger files.
keyint=50            Maximum interval between keyframes. Low values make it easier to seek specific positions in the video [6].
min-keyint=5         Minimum interval between keyframes.
-pix_fmt             Pixel format (the same as EasyCAP).
-c:a flac            Audio is encoded as FLAC, which provides reasonable compression without reduction of sound quality.
-f matroska          Mixes the audio and the video into a Matroska multimedia container.

Audio Cleanup

When you capture the audio of a VHS tape with the process outlined here, it is likely to be noticeably noisy. In addition to the noise contained in the VHS and generated by the VCR (see the box titled "Removing Audio Noise from Old VHS Tapes"), the capture card also produces its own noise. The capture card will always introduce a barely audible buzzing sound to the resulting recording, and this is especially true of cheap capture devices.

Removing Audio Noise from Old VHS Tapes

Some tapes are so worn that you might want to remove more than the capture card noise. If the VHS suffers from a constant hiss throughout the reproduction, you may want to try the following approach.

Instead of extracting a noise sample by the method described for capture card noise, take a sample during a portion of the VHS tape that should be silent. Many commercial tapes have silent parts at the beginning and end. To capture the hissing noise that occurs throughout the tape, record a few seconds of such segments. Then use this sample for generating a noise profile, as described in the article.

This process is highly destructive. Some might call it butchering the VHS, and they may be right. This will remove the hiss, but it may introduce sound artifacts that can be worse than the original noise. Use more conservative noise reduction values (less than 0.2) if you take this route. Experiment until you get the result you want.

To mitigate the noise introduced by the capture card, you can use SoX. First, capture an audio segment that only contains the noise introduced by the capture card. With your EasyCAP connected to the VCR when it is not generating any sound output (e.g., when the tape is paused), capture a few seconds of an audio segment as follows:

gst-launch-1.0 -q alsasrc device=$alsadevice ! wavenc ! fdsink | sox -t wav - -n trim 0 1 noiseprof noiseprofile

This will create a file named noiseprofile that contains a description of the noise introduced by EasyCAP. This profile can be used for reducing the noise in the audio you capture.

Now, the tape's digitized audio can be extracted from its file and cleaned with:

ffmpeg -i vhs.mkv -acodec pcm_s16le -vn vhs_tmpaud.wav
sox vhs_tmpaud.wav vhs_tmpaud-clean.wav noisered noiseprofile 0.21

A clean version of the audio will be recorded to vhs_tmpaud-clean.wav. The last value of the sox command above defines the denoiser's aggressiveness. Higher values mean more noise reduction, but at the expense of sound fidelity. Denoising is a destructive procedure that may wipe information away. Values that are too high will cause audible sound artifacts that are in fact worse than the original noise. Values between 0.21 and 0.31 are reasonable.
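
Because the right value is a matter of taste, it can help to render the same short excerpt with several settings and compare them by ear before processing the whole track. The following is a sketch only: the function names and the preview_nr_ file prefix are made up, and the excerpt (30 seconds starting one minute in) is an arbitrary choice.

```shell
# Name of the preview file for a given noise reduction amount.
preview_name() { echo "preview_nr_$1.wav"; }

# denoise_previews <input.wav> <noise profile> <amount>...
# Writes one 30-second excerpt per amount, starting 60 seconds in.
denoise_previews() {
    input=$1; profile=$2
    shift 2
    for amount in "$@"; do
        sox "$input" "$(preview_name "$amount")" \
            trim 60 30 noisered "$profile" "$amount"
    done
}
```

For example, `denoise_previews vhs_tmpaud.wav noiseprofile 0.15 0.21 0.31` produces three files to listen to side by side.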

Finally, the audio is muxed back with the video stream. In this example, a video-only stream is extracted from vhs.mkv and stored in vhs_video_only.mkv. Then this video stream is combined with the clean version of the audio just generated, and the whole file is stored as vhs_whole.mkv:

ffmpeg -i vhs.mkv -vcodec copy -an vhs_video_only.mkv
ffmpeg -i vhs_video_only.mkv -i vhs_tmpaud-clean.wav -map 0:v -map 1:a -c:v copy -c:a copy vhs_whole.mkv

The Transcoding Process

You now have a video file with clean audio that uses about 15GB of disk space. This file may be transcoded into some more practical format using any encoder, such as HandBrake or MEncoder. However, I used FFmpeg in my example, because it is easily available and well documented.

Recordings usually contain excess video at the beginning and end. To trim it out, use the following command:

ffmpeg -i vhs_whole.mkv -acodec copy -vcodec copy -ss $start_position -to $end_position vhs_trim.mkv

Start and end positions are values in the form hours:minutes:seconds, such as 01:30:50.
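
When working out the positions, it is easy to swap them or to misjudge how much video is left. The sketch below converts hours:minutes:seconds positions to seconds and reports the length of the trimmed result. The function names are hypothetical, and fractional seconds are not handled.

```shell
# Convert hh:mm:ss to seconds. Runs in a subshell so that changing IFS
# does not leak; one leading zero is stripped from each field so that
# values like 08 are not read as octal.
to_seconds() (
    IFS=:
    set -- $1
    h=${1#0}; m=${2#0}; s=${3#0}
    echo $(( ${h:-0} * 3600 + ${m:-0} * 60 + ${s:-0} ))
)

# Length in seconds of the video between two positions.
trim_length() {
    echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}
```

For instance, `trim_length 00:02:15 01:30:50` prints 5315. A negative result means the start and end positions are reversed.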

Listing 3 transcodes the trimmed file, which contains the captured video and the cleaned audio, into vhs_final.mkv.

Listing 3

Transcoding into the Final Format

ffmpeg -i vhs_trim.mkv \
       -vf "crop=(iw-10):(ih-14):3:0,pad=iw+10:ih+14:(ow-iw)/2:(oh-ih)/2,hqdn3d=2:1:2:3" \
       -c:v libx264 -flags +ilme+ildct -profile:v high -tune:v animation \
       -preset veryslow -crf 26 -c:a libfdk_aac -b:a 224k -f matroska vhs_final.mkv

The second line in Listing 3 is a set of video filters that are applied to the encoding; crop removes the overscan at the picture's borders, and pad replaces it with simple black bands. hqdn3d is a video denoiser (see the "Removing Video Noise" box).
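
The crop and pad expressions are easier to follow with concrete numbers. This sketch simply repeats the filter arithmetic from Listing 3 for a given input size; the function name is made up for illustration. Note that inside pad, iw and ih refer to the already cropped picture.

```shell
# Reproduce the geometry of
#   crop=(iw-10):(ih-14):3:0,pad=iw+10:ih+14:(ow-iw)/2:(oh-ih)/2
# for an input of <width> x <height>.
crop_pad_geometry() (
    iw=$1; ih=$2
    cw=$(( iw - 10 )); ch=$(( ih - 14 ))            # size after crop
    ow=$(( cw + 10 )); oh=$(( ch + 14 ))            # size after pad
    x=$(( (ow - cw) / 2 )); y=$(( (oh - ch) / 2 ))  # where the picture lands
    echo "crop ${cw}x${ch} at 3,0; pad to ${ow}x${oh}, picture at ${x},${y}"
)
```

For a PAL frame, `crop_pad_geometry 720 576` prints "crop 710x562 at 3,0; pad to 720x576, picture at 5,7". The crop discards 3 columns on the left, 7 on the right, and 14 rows at the bottom. The remaining picture is then centered in a frame of the original size.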

Removing Video Noise

VHS tapes often contain grainy artifacts. This video noise makes the file more difficult to compress with video encoders and delivers poor output quality, as well. Unless your source material is very high quality with no noticeable defects, applying a denoiser video filter to your captured file might be a good idea.

I used the high-quality denoise 3D (hqdn3d) filter in Listing 3. Hqdn3d accepts four different parameters that regulate its aggressiveness. These parameters are luma_spatial, chroma_spatial, luma_tmp, and chroma_tmp. Each accepts values from 0 to 255. luma_spatial and chroma_spatial affect the dissipation of static noise (the noise that is analyzed in each frame without taking other frames into account). luma_tmp and chroma_tmp describe the treatment of noise that shows up through multiple frames.

Denoising is a destructive action. To remove the grain and video defects, the video is smoothed and can become blurry if an aggressive filter is used (Figure 3). Remember that fine details, such as hair or wrinkles, may be mistaken for video noise and subsequently removed by the filter. For this reason, it is better to use conservative values for the denoiser. Hqdn3d's documentation recommends not using values greater than 10 for luma_spatial and chroma_spatial or greater than 13 for luma_tmp and chroma_tmp. These suggested limits are still very high. You might want to experiment with different numbers until you achieve a result you like.
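
Experimenting on the whole tape with the veryslow preset would take a long time, so a quick preview of a short excerpt is usually enough to judge a setting. The following is a sketch under stated assumptions: the function name, the 20-second excerpt length, and the preview_hqdn3d_ file prefix are all made up, and the fast encoder settings are only meant for previewing.

```shell
# hqdn3d_previews <input> <start position> <settings>...
# Encodes a 20-second excerpt once per hqdn3d setting (for example 2:1:2:3).
hqdn3d_previews() {
    input=$1; start=$2
    shift 2
    for settings in "$@"; do
        out="preview_hqdn3d_$(echo "$settings" | tr ':' '_').mkv"
        ffmpeg -ss "$start" -t 20 -i "$input" -vf "hqdn3d=$settings" \
               -c:v libx264 -preset ultrafast -crf 18 -an "$out"
    done
}
```

Calling `hqdn3d_previews vhs_trim.mkv 00:10:00 2:1:2:3 4:3:6:4` writes two short files that can be compared with the untouched source.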

Figure 3: Before (top) and after (bottom) the application of an aggressive denoiser.

The veryslow H.264 preset ensures that the best quality-to-size ratio is obtained at the expense of encoding time (for other presets, see Table 4). The ilme and ildct flags ensure that interlacing information is preserved. The tune:v option serves to adjust the H.264 encoder to the content type being transcoded; I used animation in Listing 3 (for other tune options, see Table 5). libfdk_aac encodes the sound into Advanced Audio Coding (AAC) at 224kbps. (See also the "Patent-Encumbered Codecs" box.)

Table 4

H.264 Presets

ultrafast

superfast

veryfast

faster

fast

medium

slow

slower

veryslow

placebo

Table 5

Some Possible Tune Options

Tune         Purpose
animation    Cartoons
film         High-quality movie content
grain        Source material with a lot of grain

Patent-Encumbered Codecs

The H.264 video and AAC audio compression standards are certainly unsuitable for hard-core FOSS proponents who live in regions where software patents apply or who plan to distribute content to those regions. If codec licensing or patent trolling is a problem, alternative codecs can be used.

VP8, Google's attempt to establish a patent-free competitor to H.264, is good enough, but it has two problems. First, it is less popular, so you are less likely to find support for it in home appliances. Second, Google has the bad habit of booting projects up, backing them with lots of resources, then suddenly realizing they are not profitable and dropping them altogether. This is the current state of VPX codecs. While VP8 is still useful, you can't count on Google to update the specification or their libraries.

Vorbis and Opus are good lossy audio codecs that can replace AAC, but as with VP8, they lack support from many domestic multimedia appliances.

Matroska is an open multimedia container that does not need replacement to keep your movie collection FOSS friendly.

The profile:v switch selects an encoding profile. Files encoded with a high profile setting will be playable by most modern multimedia appliances. The baseline profile may be used in order to ensure compatibility with older appliances, at the expense of compression efficiency [5].

Wrapping Up

The last thing you need to do is provide (optional) metadata to the file. Listing 4 shows an example metadata.xml file. Matroska metadata is not well documented, but if you want to delve deeper, see some examples online [7].

Listing 4

Example Metadata File

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Tags SYSTEM "matroskatags.dtd">
<Tags>
  <!-- movie -->
  <Tag>
    <Targets>
      <TargetTypeValue>50</TargetTypeValue>
    </Targets>
    <Simple>
      <Name>TITLE</Name>
      <String>Happy Rottweiler</String>
    </Simple>
    <Simple>
      <Name>DIRECTOR</Name>
      <String>Rubén Llorente</String>
    </Simple>
    <Simple>
      <Name>DATE_RELEASED</Name>
      <String>1992</String>
    </Simple>
    <Simple>
      <Name>ORIGINAL_MEDIA_TYPE</Name>
      <String>VHS</String>
    </Simple>
  </Tag>
</Tags>

The final step is to merge the metadata with the Matroska file, as follows:

mkvpropedit "vhs_final.mkv" --edit info --set "title=Happy Rottweiler" -t global:metadata.xml --edit track:a1 --set "language=spa" --set "flag-default=1"

This command sets the file's title to Happy Rottweiler, merges the metadata file metadata.xml with the vhs_final.mkv file, and sets the language of audio track number 1 to Spanish. This track, the only audio track in the file, is also set as the default audio track.
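
If you have several tapes to tag, the metadata file could be generated rather than edited by hand. The sketch below writes a file with the same structure as Listing 4; the function names are hypothetical, and the escaping only covers the characters &, <, and >.

```shell
# Escape the characters that are not allowed in XML text.
xml_escape() {
    printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# write_tags <title> <director> <year>
# Prints a Matroska tags file modeled on Listing 4 to standard output.
write_tags() {
    title=$(xml_escape "$1")
    director=$(xml_escape "$2")
    year=$(xml_escape "$3")
    cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Tags SYSTEM "matroskatags.dtd">
<Tags>
  <Tag>
    <Targets>
      <TargetTypeValue>50</TargetTypeValue>
    </Targets>
    <Simple>
      <Name>TITLE</Name>
      <String>$title</String>
    </Simple>
    <Simple>
      <Name>DIRECTOR</Name>
      <String>$director</String>
    </Simple>
    <Simple>
      <Name>DATE_RELEASED</Name>
      <String>$year</String>
    </Simple>
    <Simple>
      <Name>ORIGINAL_MEDIA_TYPE</Name>
      <String>VHS</String>
    </Simple>
  </Tag>
</Tags>
EOF
}
```

For example, `write_tags "Happy Rottweiler" "Rubén Llorente" 1992 > metadata.xml` recreates the file from Listing 4, apart from the comment line.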

Conclusion

Saving your old VHS tapes or audio cassettes is quite easy with minimal hardware expenses. Although video and audio encoding and analog-to-digital conversion are surrounded by a lot of black magic, this article should provide a starting point for converting your old VHS tapes for personal use.

You can easily use the procedure presented here to save audio cassettes: just capture the audio and encode it with a suitable codec, such as Vorbis, FLAC, or Opus, in a suitable container. Matroska does the trick (in fact, files with the mka extension are Matroska files with audio-only content), but its use for this purpose is not widespread.
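
As an illustration, an audio-only capture might look like the following. This is a sketch I have not tested against a cassette deck: it reuses the audio branch of the capture pipeline, encodes to FLAC, and writes a Matroska audio file. The function name and the GST_LAUNCH variable are made up; the variable only exists so that the launcher can be swapped out.

```shell
# capture_cassette <ALSA device> <output file>
# Records until interrupted with Ctrl+C; --eos-on-shutdown asks
# GStreamer to finish the file cleanly when that happens.
capture_cassette() {
    "${GST_LAUNCH:-gst-launch-1.0}" --eos-on-shutdown \
        alsasrc device="$1" do-timestamp=true \
        ! audio/x-raw,format=S16LE,rate=48000,channels=2 \
        ! audioconvert ! flacenc \
        ! matroskamux \
        ! filesink location="$2"
}
```

A call such as `capture_cassette hw:1,0 side_a.mka` would record one side of a tape. The noise reduction steps described earlier can then be applied to the result.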

If the steps presented here look too cumbersome to perform manually, you can always use LinuxTV's V4L capturing script [8] to automate the process.

The Author

Rubén Llorente is a mechanical engineer, whose job is to ensure that the security measures of a small clinic's IT infrastructure are both legally compliant and safe. Additionally, he is an OpenBSD enthusiast and a weapon collector.