Video to Text: Amazon Transcribe with S3

How lazy can you possibly get? Well, this is my story of how to transcribe text from boring videos and check it for keywords before even watching the video or listening to the audio.

To start with, the first milestone was to actually get the video. Most streaming players use HLS, which relies heavily on playlists with the m3u8 extension (those who remember playlists in Winamp might recognise it). The playlist lists all the video segments that will be streamed, and its own URL serves as the base URL for relative segment paths.
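To make the playlist idea concrete, here is a minimal Python sketch of how relative segment entries can be resolved against the playlist URL. The sample playlist, the segment names and the `segment_urls` helper are made up for illustration; real playlists carry more tags and may point to further variant playlists.

```python
from __future__ import annotations

from urllib.parse import urljoin


def segment_urls(playlist_url: str, playlist_text: str) -> list[str]:
    """Return absolute URLs for every segment entry in an m3u8 playlist."""
    urls = []
    for line in playlist_text.splitlines():
        line = line.strip()
        # Blank lines and lines starting with '#' (tags or comments) are not URIs.
        if not line or line.startswith("#"):
            continue
        # Relative entries are resolved against the playlist's own URL.
        urls.append(urljoin(playlist_url, line))
    return urls


# Hypothetical playlist content, only for demonstration.
SAMPLE = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXTINF:10.0,
segment0.ts
#EXTINF:10.0,
segment1.ts
#EXT-X-ENDLIST
"""

if __name__ == "__main__":
    for url in segment_urls("http://example.org/video/playlist.m3u8", SAMPLE):
        print(url)
```

In practice there is no need to download the segments by hand, which is exactly why ffmpeg ends up being the easier route.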

If you hit “Play” on the media player while having the DevTools Network tab open, you’ll see the player request the m3u8 playlist first, followed by the individual video segments.

After some time googling around for Python/PHP HTTP bindings to fetch the content, the simplest solution turned out to be ffmpeg:

ffmpeg -i http://example.org/playlist.m3u8 -c copy -bsf:a aac_adtstoasc output.mp4

Once done, you can check the video for consistency (either with ffmpeg -i output.mp4, or by simply scrolling through the video).

To keep the rest of the procedure light, we convert the mp4 into an audio-only mp3 with ffmpeg once again:

ffmpeg -i output.mp4 -b:a 192K -vn music.mp3

Now that the mp3 is ready to be checked, Amazon Transcribe kicks in, but you need to store your mp3 somewhere first. The easiest way is to get yourself an S3 bucket from Amazon, upload the file, and point Transcribe at the S3 URL of the file.
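The same job can also be started from code rather than from the admin panel. Below is a hedged Python sketch using boto3's `start_transcription_job`. The job name, the bucket name and the `en-US` language code are assumptions for illustration, and AWS credentials are expected to be configured already.

```python
from __future__ import annotations


def build_transcription_request(job_name: str, s3_uri: str, max_speakers: int = 2) -> dict:
    """Build the keyword arguments for Transcribe's StartTranscriptionJob call."""
    request = {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": s3_uri},
        "MediaFormat": "mp3",
        "LanguageCode": "en-US",  # assumption: the recording is in US English
    }
    # Speaker identification is optional and needs at least two speakers.
    if max_speakers >= 2:
        request["Settings"] = {
            "ShowSpeakerLabels": True,
            "MaxSpeakerLabels": max_speakers,
        }
    return request


def start_job(request: dict) -> dict:
    """Submit the job; requires boto3 and configured AWS credentials."""
    import boto3  # third-party dependency, imported only when actually needed

    client = boto3.client("transcribe")
    return client.start_transcription_job(**request)


if __name__ == "__main__":
    # Hypothetical job name and bucket; replace with your own before submitting.
    print(build_transcription_request("boring-video-1", "s3://my-bucket/music.mp3"))
```

Passing `max_speakers=0` leaves the `Settings` key out, which corresponds to running the job with speaker identification disabled.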

Transcribe Admin Panel.

The overall result for the same 1.5-hour video converted into transcribed text, with speaker identification enabled and disabled: approximately 25-30 minutes to get a 1.5 MB JSON file of the text, with separate spk_1|spk_2 labels and time codes.
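With the JSON file at hand, checking for keywords takes only a few lines. The sketch below assumes the usual layout of the Transcribe output, where `results.items` holds one entry per word with `start_time` and `alternatives[0].content`. The sample data and the `find_keywords` helper are made up for illustration.

```python
from __future__ import annotations

import json


def find_keywords(transcript: dict, keywords: set[str]) -> list[tuple[str, float]]:
    """Return (keyword, start time in seconds) for every spoken keyword."""
    wanted = {keyword.lower() for keyword in keywords}
    hits = []
    for item in transcript["results"]["items"]:
        # Punctuation items carry no time codes, so only spoken words are checked.
        if item.get("type") != "pronunciation":
            continue
        word = item["alternatives"][0]["content"].lower()
        if word in wanted:
            hits.append((word, float(item["start_time"])))
    return hits


# Hypothetical, heavily shortened sample of a Transcribe result file.
SAMPLE = json.loads("""
{
  "results": {
    "transcripts": [{"transcript": "The budget is approved."}],
    "items": [
      {"type": "pronunciation", "start_time": "1.10", "end_time": "1.30",
       "alternatives": [{"confidence": "0.99", "content": "The"}]},
      {"type": "pronunciation", "start_time": "1.50", "end_time": "1.90",
       "alternatives": [{"confidence": "0.98", "content": "budget"}]},
      {"type": "punctuation",
       "alternatives": [{"confidence": "0.0", "content": "."}]}
    ]
  }
}
""")

if __name__ == "__main__":
    for word, seconds in find_keywords(SAMPLE, {"budget", "deadline"}):
        print(f"{word} at {seconds:.2f}s")
```

The time codes let you jump straight to the relevant part of the video instead of watching all of it.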

Anime: My Top 5

1. Neon Genesis Evangelion (1995)

It’s iconic, and controversial due to the last two episodes of the series. When it comes to anime made in the 90s, that period will be associated with the name of Hideaki Anno, the director of the Neon Genesis Evangelion (NGE) series.

Director Hideaki Anno’s depression is what led to the dark themes of Neon Genesis Evangelion. Budgetary problems and parental complaints about content led to the original ending being scrapped and replaced with an extremely limited-animation ending breaking from the main plot. A movie, End of Evangelion, was later made based in part on the original planned ending and in part on Anno’s increasing frustration with the otaku fanbase. The series’ mix of psychoanalysis, religious symbolism, and genre deconstruction proved extremely influential on mature anime in the late ’90s onward. The Japan Media Arts Festival in 2006 ranked it as the most popular anime of all time.

 

2. Ghost In The Shell 2.0 (remastered from 1995)

Most of the cyberpunk we’ve got in films over the last two decades is a tribute to the creation of Mamoru Oshii: Ghost in the Shell. If by any chance you saw the movie, forget what you saw: it barely scratches the surface of the philosophy-oriented storytelling of this one.

Ghost in the Shell is a futuristic thriller with intense action scenes mixed with slower artistic sequences and many philosophical questions about one’s soul, and human identity in such an advanced age of technology.

3. Planetes (2004)

Ai Tanabe wanted to pilot ships in space, but ended up among the space junk cleaners. It’s not cyberpunk, and there’s not much of the anime we’re used to seeing. It’s a story of a man pursuing his dream, of where humanity is going, and of who we are in space.

Planetes is an unconventional sci-fi series that portrays the vastness of space as a backdrop for the personal lives of ordinary people—people who may have been born on Earth, but whose hopes and dreams lie amongst the stars. Planetes is the winner of the 2005 Seiun Award for Best Dramatic Presentation in science fiction.

4. Cowboy Bebop (1998)

Well-balanced between high-density action and light-hearted comedy, Cowboy Bebop is a space Western classic and an homage to the smooth and improvised music it is named after. It’s atmospheric and easy-going, a real contrast to the ones mentioned above.

Cowboy Bebop’s biggest influence has been in the United States, where it premiered on Adult Swim in 2001 with many reruns since. The show’s heavy Western influence struck a chord with American viewers, for whom it became a “gateway drug” to anime aimed at adult audiences.

5. Eureka Seven (2005)

Just to dilute this super philosophical list of sci-fi, cyberpunk, etc., here’s a light one: a classic mecha TV series.

The anime has won multiple awards including “Best Screenplay,” “Best Television Series,” and “Best Character Designs” at the 5th Tokyo Anime Awards.

#delete campaigns: social solidarity vs privacy

Uber: Travel ban

In January 2017, we witnessed the #deleteuber social media campaign. The movement erupted after Trump’s travel ban on Muslim-majority countries, when NYC taxi drivers went on strike. At the very same moment Uber announced that surge pricing at JFK airport was being turned off.

By February 2017, reports showed 200,000 accounts deleted as an act of solidarity, in protest against the US President’s decision.

Cambridge Analytica: Elections

In March 2018, Christopher Wylie, a whistleblower from Cambridge Analytica (which has nothing to do with the famous university), gave an interview to The Guardian on how the company had been collecting Facebook user profiles and presumably helped target the Republican Party’s campaign to win the elections.

Social media just went nuts on the subject. The chance of your social profile being harvested for micro-targeting, in order to shape your opinion on any sociopolitical matter, launched yet another delete campaign: #deletefacebook. As of today, it’s been reported that 87m profiles may have been leaked to Cambridge Analytica, which is said to be a part of SCL Group.

Some details on who these folks are:

SCL’s involvement in the political world has been primarily in the developing world where it has been used by the military and politicians to study and manipulate public opinion and political will. It uses what have been called “psy ops” to provide insight into the thinking of the target audience. According to its website, SCL has influenced elections in Italy, Latvia, Ukraine, Albania, Romania, South Africa, Nigeria, Kenya, Mauritius, India, Indonesia, The Philippines…(c) Wikipedia

What’s quite interesting about this whole story is that it emphasised the privacy leak in the first place. A week later it twisted into yet another “it’s Trump’s fault” story, and all hell broke loose on social networks.

Frankly speaking, this for/against Trump campaign is not my thing. I’m not a US citizen. I didn’t vote. Thus, I don’t care. American elections are solely a matter for the US people.

Technically speaking, for a person who reads about and does some IT things, it boils down to the subject of privacy and the media that we use in our day-to-day routines.

If you’re not paying for the product, you are the product

Whenever you use any social medium, you share your private information: those crazy useless quizzes asking for your location, ad rotation, bounce rates. It was just a matter of time before some company appeared on the horizon and started crunching your data for its own purposes. Marketing tools, in combination with psychology and IT, might give you a proper railgun for social science and opinion forming.

It’s your decision whether to support or ignore the #deletefacebook movement. Edward Snowden gave an interview on the matter, which has some insights on data privacy and state control. He might be right that it’s us, our generation, who will take control of our personal data, or else it’s too late.

What’s after senior developer by Christian Heilmann

It is painful to see how clumsy companies are in trying to keep their techies happy. We do team building exercises, we offer share options. We pay free lunches and try to do everything to keep people in the office. We print team T-shirts and stickers and pretend that the company is a big, happy family. We pay our technical staff a lot and wonder why people are grumpy and leave.

What gets us going is a feeling of recognition and respect. And only peers who’ve been in the same place can give that. There is no way to give a sincere compliment when you can’t even understand what the person does.

Great article from Christian Heilmann about the career ladders and what comes after senior developer.

CakePHP CsvMigrations: prototype in 3 2 .. Done!

Every programmer is tired of coding yet another login form. Yet another CRUD view module. So today I’ll show you an example where our laziness can get you building prototype systems without a single line of extra code.

One of the tasks that we had to face when developing Qobrix CRM was fast prototyping of the system. Ultimate laziness and the DRY concept resulted in the cakephp-csv-migrations plugin, which we try to use pretty much everywhere when delivering the system.

Your application is not unique

Whatever you require from your prototype is mostly based on the same functionality:

  • CRUD views
  • Basic CRUD actions
  • Event/Trigger system that allows you to mutate the data

If we dive deeper into these points, whatever you work with is form based – your input fields are the minimum atomic unit of interaction: strings, longtexts, datetime. Looks familiar? Exactly, database data types.

In certain cases, you store dates in strings, and names in varchars or even longtext. At this point we need a mechanism that binds your application logic to your storage engine. For simplicity, I’ll base this example on an RDBMS such as MariaDB or MySQL.

Changing those bindings might be difficult for the user. As the supplier of the system, you don’t know who’s going to deal with it: some companies don’t have IT departments, but need to modify things rapidly. The same applies to the developers’ level of expertise with the system. We wanted to make it as simple as possible.

Preparing the App

Note: If you already have a CakePHP application running, just run composer require qobo/cakephp-csv-migrations, and you can skip this part.

Theory is boring without examples, so I’ll show you a basic example of how to expand the system with extra modules. For simplicity, I’ll base it on the project-template-cakephp template that we frequently use. It already has some dependencies, including the cakephp-csv-migrations plugin as part of cakephp-utils.

composer create-project qobo/project-template-cakephp baking_app
cd baking_app
./bin/build app:install DB_NAME="baking_app",CHOWN_USER=$USER,CHGRP_GROUP=$USER,PROJECT_NAME="My Baking App"
./bin/phpserv

That’s enough to check that your app is up and running. For basic credentials and other settings, you can check the .env file that was generated by the Robo build scripts.

Baking Recipes Module

Here comes the baking part. We’re going to make a simple recipes module to store our favourite recipes.

./bin/cake bake csv_module Recipes
./bin/cake bake csv_migration Recipes

The first command will create dummy MVC instances for CakePHP: Model/Entity, Controller and ApiController files, based on which the second command can bake a migration script (based on Phinx migrations).

If you look into the migration file created in config/Migrations/<timestamp>_Recipes<timestamp>.php, you’ll see something like this:

<?php
use CsvMigrations\CsvMigration;

class Recipes20180126154309 extends CsvMigration
{
    public function change()
    {
        $table = $this->table('recipes');
        $table = $this->csv($table);

        if (!$this->hasTable('recipes')) {
            $table->create();
        } else {
            $table->update();
        }

        $joinedTables = $this->joins('recipes');
        if (!empty($joinedTables)) {
            foreach ($joinedTables as $joinedTable) {
                $joinedTable->create();
            }
        }
    }
}

Where are the fields? That’s the point where all the magic happens.

The CsvMigrations plugin provides you with a vast number of input types that can bind to the basic data types of your database. They’re stored in the config/Modules/Recipes/db/migration.csv file. We’ll expand it a bit:

FIELD NAME,FIELD TYPE,REQUIRED,NOT SEARCHABLE,UNIQUE
id,uuid
name,string
meal_type,list(meal_types)
recipe,text
created,datetime
modified,datetime
created_by,related(Users)
modified_by,related(Users)

I’ve added the name, meal_type and recipe fields, which can be handled by varchar and longtext data types in the database. Let’s cook it:

./bin/cake migrations migrate

And we are done! You may have noticed the list(<list_name>) type used within migration.csv. This FieldHandler type is used for predefined option lists, rendered as select boxes in your form. The list is stored in config/Modules/Common/lists/meal_types.csv:

VALUE,LABEL,INACTIVE
breakfast,Breakfast,
dinner,Dinner,
supper,Supper
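
To illustrate how the two CSV files relate, here is a small Python sketch that reads the field definitions and the option list. It is only an illustration of the file format shown above, not the plugin’s actual PHP implementation; the `parse_field_type`, `read_fields` and `read_list_options` helpers are hypothetical.

```python
from __future__ import annotations

import csv
import io
import re

# A field type is a name with an optional argument, e.g. "list(meal_types)".
FIELD_TYPE = re.compile(r"^(?P<type>\w+)(?:\((?P<arg>\w+)\))?$")


def parse_field_type(spec: str) -> tuple[str, str | None]:
    """Split 'list(meal_types)' into ('list', 'meal_types')."""
    match = FIELD_TYPE.match(spec.strip())
    if match is None:
        raise ValueError(f"Unrecognised field type: {spec!r}")
    return match.group("type"), match.group("arg")


def read_fields(csv_text: str) -> dict[str, tuple[str, str | None]]:
    """Map every field name in migration.csv to its parsed type."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    return {row[0]: parse_field_type(row[1]) for row in reader if len(row) >= 2}


def read_list_options(csv_text: str) -> list[tuple[str, str]]:
    """Return (value, label) pairs from a list file such as meal_types.csv."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["VALUE"], row["LABEL"]) for row in reader]


MIGRATION = """FIELD NAME,FIELD TYPE,REQUIRED,NOT SEARCHABLE,UNIQUE
id,uuid
name,string
meal_type,list(meal_types)
recipe,text
created_by,related(Users)
"""

MEAL_TYPES = """VALUE,LABEL,INACTIVE
breakfast,Breakfast,
dinner,Dinner,
supper,Supper
"""

if __name__ == "__main__":
    print(read_fields(MIGRATION)["meal_type"])
    print(read_list_options(MEAL_TYPES))
```

The argument of the list type is what ties a field to its list file, so renaming meal_types.csv means updating migration.csv as well.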

Where are my views?

If you start up the application and navigate to http://localhost:8000/recipes/add, you’ll see something like this:

add recipes form

Now we need to add fields to the CRUD form. CsvMigrations can help you with that. All your form fields are located in config/Modules/Recipes/views:

PANEL NAME,FIRST COLUMN FIELD NAME,SECOND COLUMN FIELD NAME
Details,name,meal_type
Recipe,recipe

The example above applies to both the add.csv and edit.csv files. Reloading the add page:

Complete add form

Conclusion

Now you’re ready to work with basic CRUD. All the common CRUD logic is already located in the cakephp-csv-migrations plugin, which also handles the API requests that load the DataTables grid on the index page. If you want to change its behavior, you can always override the action methods in `RecipesController`.

WordPress Gutenberg: it’s not about Text Editor

I didn’t pay much attention to the announcement of the Gutenberg project from the WordPress guys back in the 2010s.

I never had any dramatic issues with CKEditor embedded in the WordPress admin panel. I still think it’s one of the best examples of text editor UI/UX on the Web. The whole development process caught my attention due to the React licensing issue that had the Internet buzzing for a couple of months, until Facebook changed the license.

And then I checked this video on the future of WYSIWYG editor and Gutenberg’s impact on the WordPress ecosystem.

This is huge! The whole ecosystem will change its standards for writing plugins and themes. The concept of viewports expands beyond the classical monitor resolution to include wearables and other portable devices. Block architecture. Enough with spoilers: just watch the video.

Farewell 2017. New Year – new challenges

The end of this crazy year is almost around the corner, and I guess it’s time to summarise it.

Traveling

We finally got some time to travel abroad. Scotland was the destination. Nadia and I are thinking of planning one more trip there, this time checking out the West Coast of the country. Edinburgh is definitely the city to consider moving to, in case I ever get tired of Cyprus.

Working

2017 passed under the aegis of “Hold my beer!” A number of really challenging projects were successfully launched in 2017. A couple of zombie projects had to be resurrected from a nearly dead condition so they could survive Black Friday and Christmas sales. And they did!

Surprisingly, I found myself coding lots of JavaScript at work and in my free time. I never thought of becoming a frontend developer, and I’m still not planning to, but JS crosses my path more frequently than I expected (huh!).

Free Time

Is there any, duh?.. 2017 was full of different pivotal moments. Health-wise, a 3-month gym challenge proved that I can’t go without it for long. It seems that this hobby is here to stay.

Hiking. I’d like to keep it on a weekly basis, but apparently a monthly one is more realistic (plus Cyprus is running out of hiking routes quite fast at such a pace!).

Back to normality. Apart from reading tech books, I finally found some room for non-technical literature. A definite challenge for 2018 would be to finish the Hugo Awards list, which leaves another 8-10 books remaining from 2017.

2018 Resolutions

Everyone likes lists. Lists are easier to memorise. This one might help me remember what I was planning for 2018. At least for a week.

  • Travel more!
  • More sports (Gym is fun, but with goal setting, is more challenging)
  • Figure out what kind of monster is(are): Python, Swift.
  • Pump my JavaScript/PHP madskillz.
  • Read more. A lot more.
  • Survive after quitting cigarettes.

Well, this looks like a list that I can accomplish, or at least try to 🙂

 

Happy New Year y’all! Now lots of food, festive mood, and wishes of health, happiness, and whatever crazy comes to mind…

Over & out.

Puppeteer: NodeJS browser automation

Puppeteer is a Node library which provides a high-level API to control headless Chrome over the DevTools Protocol. It can also be configured to use full (non-headless) Chrome.

A demo is worth a thousand words. The API documentation covers all the aspects of browser emulation/handling I could think of.

Whether it will be a replacement for NightwatchJS or a sub-component in the current end-to-end stack, fun times of trial & error will tell.