AssemblyAI recently released Conformer AI 2, a model built primarily for transcription that can also handle summarization, tone, topic detection, and more. The company has now released a new framework, LeMUR, which targets a slightly different use case than Conformer AI but builds on the same underlying technology. With LeMUR, the audio is first transcribed, and then you can ask questions about it.

The company initially opened this tool to a handful of users in April, and it is now quite stable and accurate for specific tasks. For example, you can process your calls, videos, or podcasts with LeMUR and then ask questions about them. So let’s see how it works and how you can use it.

How to use LeMUR AI to ask questions about your audio or video

It is quite straightforward, and you do not need any coding skills to use it. Start by transcribing your audio or video file so that you can ask questions about it, somewhat like querying your own dataset. The best thing is that the playground is free for everyone, without any restrictions or login requirements.

  • Open AssemblyAI LeMUR Playground.
  • You will see an option to upload your audio file and a text box where you can paste a YouTube link.
  • For this example, we pasted a YouTube video link and clicked Start processing.
  • It will then transcribe your file and show the available features. Pick one of the two options: Question and Answer or Custom Summary.
  • Click the one you need, and it will generate a response like a chatbot.
  • That’s it!

I suggest you experiment with it and check the company’s resources, which include a prompt guide for getting better results.

Features of LeMUR AI

LeMUR AI covers several use cases that set it apart from similar AI tools. The following features go beyond what the company’s other product, Conformer AI 2, offers.

  • Custom Summary: You can extract a summary from your audio.
  • Question and Answer: You can ask questions and get answers based on your uploaded media file.
  • Action Items: You can request action items directly from your media file, which is useful for call recordings or meetings.

If you need to process multiple video files, or large videos with an hour or more of audio, check out the LeMUR API. The features listed above are also available on other platforms, such as Microsoft Teams, but those are paid, while the playground is free. The reason is that Microsoft has integrated these features into its meeting app, where they work in real time, whereas LeMUR AI is a custom solution used in post-processing. Note that API usage is billed by the company, and any third-party product built on it will likely charge for it as well.

Assembly AI LeMUR API

The company has also released an API that you can integrate into your apps and services. You can find the SDK on the site and program it to fit your task. For example, if you want a summary, you can tailor the request by supplying context and describing the format you expect the response in.
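As a rough illustration, the context and response format mentioned above could be sent as fields in a JSON request. This is a minimal sketch, not the official client: the endpoint URL, the field names (`transcript_ids`, `context`, `answer_format`), and the `authorization` header are assumptions based on AssemblyAI's public REST API and should be checked against the current documentation. The helper names and placeholder values are hypothetical, and the sketch assumes you already have the ID of a finished transcript.

```python
import json
import urllib.request

# Assumed endpoint; confirm it against AssemblyAI's documentation.
LEMUR_SUMMARY_URL = "https://api.assemblyai.com/lemur/v3/generate/summary"


def build_summary_request(transcript_id, context, answer_format):
    """Build the JSON body for a custom summary of one finished transcript."""
    return {
        "transcript_ids": [transcript_id],
        "context": context,              # background that steers the summary
        "answer_format": answer_format,  # how the response should be shaped
    }


def post_json(url, api_key, payload):
    """POST the payload as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    body = build_summary_request(
        transcript_id="YOUR_TRANSCRIPT_ID",  # placeholder
        context="A weekly product meeting",
        answer_format="three short bullet points",
    )
    print(json.dumps(body, indent=2))
    # Uncomment once you have a real API key and transcript ID:
    # print(post_json(LEMUR_SUMMARY_URL, "YOUR_API_KEY", body))
```

Running the script as written only prints the request body, so you can inspect it before sending anything.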

As mentioned at the beginning, LeMUR’s primary focus is answering questions about your media file. With the API, you can push this further, for example by preparing specific questions in advance so that the answers come back as soon as the data has been processed.
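Preparing questions in advance might look like the sketch below. It only builds the request body. The endpoint URL and the field names (`questions`, `question`, `context`, `answer_format`) are assumptions based on AssemblyAI's public REST API, and `PreparedQuestion` is a hypothetical helper introduced here for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional

# Assumed endpoint; confirm it against AssemblyAI's documentation.
LEMUR_QA_URL = "https://api.assemblyai.com/lemur/v3/generate/question-answer"


@dataclass
class PreparedQuestion:
    """One question to ask once the transcript is ready (hypothetical helper)."""
    question: str
    context: Optional[str] = None
    answer_format: Optional[str] = None


def build_qa_request(transcript_id: str, questions: List[PreparedQuestion]) -> Dict:
    """Build the JSON body, leaving out optional fields that were not set."""
    return {
        "transcript_ids": [transcript_id],
        "questions": [
            {key: value for key, value in asdict(q).items() if value is not None}
            for q in questions
        ],
    }


if __name__ == "__main__":
    prepared = [
        PreparedQuestion("What was the main topic of the call?"),
        PreparedQuestion("Was a deadline agreed?", answer_format="yes or no"),
    ]
    body = build_qa_request("YOUR_TRANSCRIPT_ID", prepared)  # placeholder ID
    print(f"POST {LEMUR_QA_URL}")
    print(json.dumps(body, indent=2))
```

Keeping the questions in a list like this makes it easy to reuse the same set for every new recording.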

If you use it for calls or meetings, you may need help remembering which actions to take afterwards; Action Items returns them in the response. Usage is quite flexible, depending on what you want to focus on. A single request can cover up to 1 million tokens, which according to AssemblyAI is roughly 100 hours of audio rather than 1 hour, and you can process multiple files at once. If you are a developer interested in integrating LeMUR into your apps and services, check out AssemblyAI’s documentation.
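Requesting action items for several recordings in one call could be sketched as follows. As before, this only assembles the request body. The endpoint URL and field names are assumptions based on AssemblyAI's public REST API, and the helper functions are hypothetical.

```python
import json
from typing import Dict, Iterable, List

# Assumed endpoint; confirm it against AssemblyAI's documentation.
LEMUR_ACTION_ITEMS_URL = "https://api.assemblyai.com/lemur/v3/generate/action-items"


def unique_in_order(transcript_ids: Iterable[str]) -> List[str]:
    """Drop duplicate IDs while keeping the original order."""
    return list(dict.fromkeys(transcript_ids))


def build_action_items_request(transcript_ids: Iterable[str], context: str = "") -> Dict:
    """Build one JSON body that covers every transcript in the batch."""
    ids = unique_in_order(transcript_ids)
    if not ids:
        raise ValueError("at least one transcript ID is required")
    body: Dict = {"transcript_ids": ids}
    if context:
        body["context"] = context  # optional background for the model
    return body


if __name__ == "__main__":
    # Placeholder IDs; the duplicate is removed before the request is built.
    batch = ["MEETING_1_ID", "MEETING_2_ID", "MEETING_1_ID"]
    body = build_action_items_request(batch, context="Weekly sales calls")
    print(f"POST {LEMUR_ACTION_ITEMS_URL}")
    print(json.dumps(body, indent=2))
```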