tags: python psql This is a simple example of storing info about `.mp4` files. Basically this parses the output of `ffprobe` go get the info. A first pass (not here) constructs a `.json` file containing a dict, the keys of the dict are the paths to the files, the values are the output of `ffprobe -print_format json -show_format -show_streams file.mp4`. The `Psql1` class is a convenience wrapper to do most of the psycopg boilerplate. The `jda_time` module is a personal module with a few time related functions. The `jda1` module is a big pile of convenience functions, such as `rjf` for 'read json from'. The `tohms` function takes a number of seconds, say 3661, and turns it into a string like `1h1m1s` (I tend to use this format when storing durations in filenames, making it easy to see the duration of a file without either viewing details in a file manager or running `ffprobe` or something). ```py from jda_psql import Psql1 from jda_time import tohms from jda1 import rjf class App(Psql1): def go(self): cur = self.cur ex = cur.execute ex("DROP TABLE IF EXISTS films") ex("""CREATE TABLE films ( id SERIAL PRIMARY KEY, path TEXT, filename TEXT, duration REAL, duration_hms TEXT, width INTEGER, height INTEGER, format JSONB, streams JSONB )""") data = rjf("info.json") for k,v in data.items(): fmt = v["format"] streams = v["streams"] path = k filename = k.split("/")[-1] duration, width, height = self.get_info_from(v) duration_hms = tohms(int(duration)) print(path,filename,duration,duration_hms,width,height) ex("""INSERT INTO films (path, filename, duration, duration_hms, width, height, format, streams ) VALUES ( %s, %s, %s, %s, %s, %s, %s, %s )""",( path, filename, duration, duration_hms, width, height, self.to_jsonb(fmt), self.to_jsonb(streams))) def get_info_from(self,vinfo): width = 0 height = 0 duration = 0 duration = float(vinfo["format"]["duration"]) streams = vinfo["streams"] for stream in streams: if stream["codec_type"] == "video": width = stream["width"] height = stream["height"] break return duration, width, height app = App(host="...",user="...",password="...",dbname="...") app.main() ``` ## Example queries ``` # note ilike for case insensitive like select filename from films where filename ilike '%space%'; # how to dig into an array of objects select filename, duration, format->'duration', streams->0->'codec_type' from films limit 1; # count number of streams in a file select filename, jsonb_array_length(streams) from films limit 4; ``` Now anything more complex digging into the json is best done by simply fetching all the json and handling it in javascript or python, rather than trying some clever sql query.