Mike Slinn

Extracting Audio from an MP4 as 32-bit WAV

Published 2021-11-04.
Time to read: 1 minute.

This page is part of the av_studio collection, categorized under Media.

My Sony Alpha 7 Mark III camera creates MP4 files with good-quality stereo audio. I wanted to extract the audio to a 32-bit WAV file so I could work on it further in Pro Tools. Here is a bash script I wrote to do that:

#!/bin/bash

# $1 Input file path

function help {
  # Print the optional error message as data, not as a printf format string
  if [ "$1" ]; then printf '%s\n\n' "$1"; fi
  echo "$(basename "$0") - Extract audio stream from an mp4 file and save as 32-bit wav

Usage: $(basename "$0") filename
"
  exit 1
}

if [ -z "$1" ]; then help "Error: no media file name specified"; fi

if [ ! -f "$1" ]; then help "Error: '$1' not found"; fi

# Split the input path into its directory and its base name without the extension
filename="$( basename -- "$1" )"
path="$( dirname -- "$1" )"
filename="${filename%.*}"

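# -vn drops the video stream; pcm_f32le writes 32-bit floating-point PCM;
# -ar 44100 resamples to 44.1 kHz; -ac 2 keeps two channels (stereo)
ffmpeg \
  -i "$1" \
  -vn \
  -acodec pcm_f32le \
  -ar 44100 \
  -ac 2 \
  "$path/$filename.wav"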

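The script derives the output name by splitting the input path with basename, dirname, and the ${name%.*} parameter expansion. Here is a minimal sketch of just that step, so it can be tried without running ffmpeg. The function name wav_path_for is hypothetical and is not part of the script above.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): print the .wav path
# that corresponds to a given media file path.
wav_path_for() {
  local name dir
  name="$(basename -- "$1")"   # file name without its directory
  dir="$(dirname -- "$1")"     # directory part, or "." if there is none
  # ${name%.*} removes the shortest trailing ".something", i.e. the last extension
  printf '%s/%s.wav\n' "$dir" "${name%.*}"
}

wav_path_for "Video Files/Descending C to G djembe.mp4"
```

Only the last extension is stripped, so a name such as clip.final.mp4 would become clip.final.wav.
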
This is a sample usage:

Shell
$ mp4ToWav "Video Files/Descending C to G djembe.mp4"
ffmpeg version 4.3.2-0+deb11u1ubuntu1 Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 10 (Ubuntu 10.2.1-20ubuntu1)
  configuration: --prefix=/usr --extra-version=0+deb11u1ubuntu1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55fc3d8d1f80] st: 0 edit list: 1 Missing key frame while searching for timestamp: 1001
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55fc3d8d1f80] st: 0 edit list 1 Cannot find an index entry before timestamp: 1001.
Guessed Channel Layout for Input Stream #0.1 : stereo
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'Video Files/Descending C to G djembe.mp4':
  Metadata:
    major_brand     : XAVC
    minor_version   : 16785407
    compatible_brands: XAVCmp42iso2
    creation_time   : 2021-10-31T19:00:25.000000Z
  Duration: 00:09:06.55, start: 0.000000, bitrate: 51575 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709/bt709/iec61966-2-4), 1920x1080 [SAR 1:1 DAR 16:9], 49492 kb/s, 59.94 fps, 59.94 tbr, 60k tbn, 119.88 tbc (default)
    Metadata:
      creation_time   : 2021-10-31T19:00:25.000000Z
      handler_name    : Video Media Handler
      encoder         : AVC Coding
    Stream #0:1(und): Audio: pcm_s16be (twos / 0x736F7774), 48000 Hz, stereo, s16, 1536 kb/s (default)
    Metadata:
      creation_time   : 2021-10-31T19:00:25.000000Z
      handler_name    : Sound Media Handler
    Stream #0:2(und): Data: none (rtmd / 0x646D7472), 491 kb/s (default)
    Metadata:
      creation_time   : 2021-10-31T19:00:25.000000Z
      handler_name    : Timed Metadata Media Handler
      timecode        : 07:09:43:54
Stream mapping:
  Stream #0:1 -> #0:0 (pcm_s16be (native) -> pcm_f32le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'Video Files/Descending C to G djembe.wav':
  Metadata:
    major_brand     : XAVC
    minor_version   : 16785407
    compatible_brands: XAVCmp42iso2
    ISFT            : Lavf58.45.100
    Stream #0:0(und): Audio: pcm_f32le ([3][0][0][0] / 0x0003), 44100 Hz, stereo, flt, 2822 kb/s (default)
    Metadata:
      creation_time   : 2021-10-31T19:00:25.000000Z
      handler_name    : Sound Media Handler
      encoder         : Lavc58.91.100 pcm_f32le
size=  188304kB time=00:09:06.55 bitrate=2822.4kbits/s speed= 153x
video:0kB audio:188304kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000059%

The original MP4 was 3.4 GB, and the output WAV was 188 MB.
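
As a rough sanity check on the output size: uncompressed PCM takes sample rate × channels × bytes per sample × duration. The sketch below plugs in the values from the ffmpeg log (44100 Hz, stereo, 4 bytes per 32-bit sample, 00:09:06.55). The function name pcm_bytes is hypothetical, and passing the duration in hundredths of a second is only a convenience to stay within bash integer arithmetic.

```shell
#!/bin/bash
# Hypothetical helper: estimate the payload size of uncompressed PCM audio.
# $1 sample rate (Hz), $2 channels, $3 bytes per sample,
# $4 duration in hundredths of a second
pcm_bytes() {
  echo $(( $1 * $2 * $3 * $4 / 100 ))
}

# 9 min 6.55 s = 546.55 s = 54655 hundredths of a second
bytes=$(pcm_bytes 44100 2 4 54655)
echo "$bytes bytes, about $(( bytes / 1024 )) KiB"
```

The estimate lands within about 1 KiB of the size=188304kB that ffmpeg reported, which suggests that ffmpeg's kB figure is counted in units of 1024 bytes; the WAV file also carries a small header.
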