Thanks to recent innovations with LLMs, and the fantastic work of justine.lol, we now have open source large language models to play with. While these models have no real reasoning abilities, their capacity to compress language makes them useful for summarizing.
In this post, we will look at a few bash functions that turn a YouTube video into a summary that can be read quickly to get the gist of what is happening.
These functions can also be used to summarize any content that can be converted to plain text.
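For instance, assuming the functions defined below are loaded and poppler's pdftotext utility is installed (it is not otherwise part of this post's toolchain), a PDF could be summarized the same way:
pdftotext paper.pdf paper.txt
summarize-txt paper.txt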
Download video transcripts
The model deals with text, so we download the video transcript using yt-dlp, an open source command line interface to YouTube and various other video providers; then we do some cleaning with ex, the line-editor mode of vim.
function download_video_transcript() {
  # Work inside a fresh temporary directory to avoid clobbering files
  cd "$(mktemp -d)" || return
  # Fetch only the auto-generated subtitles (.vtt), not the video itself
  yt-dlp --write-auto-sub --skip-download "$1"
  # ex script that strips the WebVTT header, timing tags, empty lines
  # and consecutive duplicates, then prints the cleaned buffer to stdout
  cat <<'EOF' >process.vim
silent! 1,3d
silent! g/<\/c>/d
silent! g/-->/d
silent! g/^ *$/d
silent! %!uniq
%p
quit!
EOF
  ex -s -c "source process.vim" *.vtt >clean_transcript.txt
  summarize-txt clean_transcript.txt
}
Note the option --write-auto-sub, which downloads the automatically generated subtitles. One could also use --write-sub, which downloads the author's subtitles, but some videos lack them, so I found this script more robust when always downloading the auto-generated ones. See this page for more subtitle options.
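If you want to check what a given video offers before choosing, yt-dlp can list the available subtitle tracks, and the download can be restricted to a single language so the cleaning step only sees one .vtt file (the URL below is a placeholder):
# List the manual and auto-generated subtitle tracks of a video
yt-dlp --list-subs "https://www.youtube.com/watch?v=..."
# Keep only the English auto-generated track
yt-dlp --write-auto-sub --sub-lang en --skip-download "https://www.youtube.com/watch?v=..."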
Iteratively reduce the transcript into smaller summaries
The number of tokens a model can ingest at once is limited: if the transcript is too long, the model fails. To avoid this, we ask the model to iteratively summarize chunks of the transcript and assemble the partial summaries into a shorter text, repeating the whole process as many times as necessary. The 10,000-byte chunks used below stay well within the context window (a common rule of thumb is around four characters per token, so each chunk is roughly 2,500 tokens).
function summarize-txt() {
  # ex script that joins the whole buffer into one line and writes it back,
  # so each chunk becomes a single long line of text; the heredoc is quoted
  # so that bash does not expand "$join"
  cat <<'EOF' >process2.vim
silent! 1,$join
w
quit!
EOF
  cp "$1" summary.txt
  while true; do
    size=$(wc -c summary.txt | awk '{print $1}')
    # Stop once the summary is short enough to read in one sitting
    if [ "$size" -lt 1000 ]; then
      break
    fi
    # Cut the current summary into 10,000-byte chunks named xaa, xab, ...
    split -b 10000 summary.txt
    parts=(x*)
    rm summary.txt
    pwd # show the temporary directory holding the intermediate files
    echo "# Summarizing ${size} bytes, split in ${#parts[@]} parts" | tee -a process.txt
    for part in "${parts[@]}"; do
      # Flatten the chunk to one line before sending it to the model
      ex -s -c "source process2.vim" "$part"
      echo "## Summarizing ${part}" | tee -a process.txt
      # Append each partial summary to the input of the next iteration
      summarize-txt-once-api "$part" | tee -a summary.txt | tee -a process.txt
      rm "$part"
    done
  done
}
Summarize a text via the llamafile API
To summarize a text via the API, we use a combination of jq and curl.
function summarize-txt-once-api() {
  # Request template in the OpenAI chat completions format; the model
  # name is a placeholder, the local server answers with whatever model
  # it loaded
  echo '{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "system",
      "content": "You are a summarization assistant, skilled in summarizing any complex text that the user sends with precise and terse output."
    },
    {
      "role": "user",
      "content": ""
    }
  ]
}' >content.json
  # Inject the raw content of the file into the user message
  jq --rawfile a "$1" '.messages[1].content = ($a)' content.json >content_updated.json
  # Query the local llamafile server and extract the completion text
  curl -s http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d @content_updated.json | jq -r ".choices[0].message.content"
}
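Once the llamafile server from the next section is running, the function can be sanity-checked on its own; the sample file below is just an illustration:
echo "Bash is a Unix shell and command language, first released in 1989." >sample.txt
summarize-txt-once-api sample.txt
Download and start a llamafile model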
Head over to justine.lol's Hugging Face profile and download a llamafile. As of 2024-04, Mixtral yields very good results on a MacBook M2 laptop.
There are various ways to invoke the model as a bash command, starting and stopping it for each task. While that is convenient for one-off jobs, summarizing a long text issues many queries, so we instead start the model as a server: the model is loaded once and shared by all concurrent queries. To do so, we follow the instructions from the author's blog:
wget https://huggingface.co/jartine/Mixtral-8x7B-v0.1.llamafile/resolve/main/mixtral-8x7b-instruct-v0.1.Q5_K_M-server.llamafile
chmod +x mixtral-8x7b-instruct-v0.1.Q5_K_M-server.llamafile
./mixtral-8x7b-instruct-v0.1.Q5_K_M-server.llamafile --nobrowser
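Before launching long summarization jobs, it is worth checking that the server answers. A minimal request against the same endpoint used by summarize-txt-once-api should print a short completion:
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Say hello."}]}' \
  | jq -r ".choices[0].message.content"
Conclusion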
By the end of this post, you should be able to run:
download_video_transcript "ytsearch:the PARA method for organizing one self"
and get a small summary.txt file that contains the gist of the video.
If the video was very rich, the size threshold might have forced the model to omit important key points. In any case, you can always inspect the process.txt file, which contains the earlier, longer versions of the summary.