
Gemini Safety Filtering

When using Gemini AI to perform translation or speech recognition tasks, you may sometimes encounter errors such as "Response content is flagged".


This happens because Gemini applies safety restrictions to the content it processes. The code already sets the adjustable filters to the most lenient option, BLOCK_NONE, but Gemini still makes the final decision on whether to filter, based on its own overall assessment.

Gemini API's adjustable safety filters cover the following categories, and content not listed here cannot be adjusted through code:

| Category | Description |
| --- | --- |
| Harassment | Negative or harmful comments targeting identity and/or protected attributes. |
| Hate Speech | Rude, disrespectful, or profane content. |
| Sexually Explicit | Contains references to sexual acts or other obscene content. |
| Dangerous Content | Promotes, facilitates, or enables harm. |
| Civic Integrity | Election-related queries. |

The table below describes the blocking settings in the code that can be used for each category.

For example, if you set the blocking setting for the Hate Speech category to Block a few, the system blocks everything that has a high probability of being hate speech content, but allows anything with a lower probability.

| Threshold (Google AI Studio) | Threshold (API) | Description |
| --- | --- | --- |
| Block Nothing | BLOCK_NONE | Always display, regardless of the likelihood of unsafe content |
| Block a Few | BLOCK_ONLY_HIGH | Block when there is a high probability of unsafe content |
| Block Some | BLOCK_MEDIUM_AND_ABOVE | Block when the likelihood of unsafe content is medium or high |
| Block Most | BLOCK_LOW_AND_ABOVE | Block when the likelihood of unsafe content is low, medium, or high |
| Not Applicable | HARM_BLOCK_THRESHOLD_UNSPECIFIED | Threshold is not specified, use the default threshold to block |
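The threshold table above can be read as a simple comparison between a probability level and a per-threshold floor. The following is a minimal sketch of that reading, not Gemini's actual implementation: the `Probability` enum, the `BLOCK_FROM` mapping, and the `is_blocked` helper are hypothetical names, and the default used for the unspecified threshold is an assumption, since the real default depends on the model.

```python
from enum import IntEnum


class Probability(IntEnum):
    """Likelihood levels that the safety filter assigns to content."""
    NEGLIGIBLE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Lowest probability level that each API threshold blocks (None = never block).
BLOCK_FROM = {
    "BLOCK_NONE": None,
    "BLOCK_ONLY_HIGH": Probability.HIGH,
    "BLOCK_MEDIUM_AND_ABOVE": Probability.MEDIUM,
    "BLOCK_LOW_AND_ABOVE": Probability.LOW,
}


def is_blocked(threshold: str, probability: Probability,
               default: str = "BLOCK_MEDIUM_AND_ABOVE") -> bool:
    """Return True if content at `probability` is blocked under `threshold`."""
    if threshold == "HARM_BLOCK_THRESHOLD_UNSPECIFIED":
        # Assumed default for illustration; the real one depends on the model.
        threshold = default
    floor = BLOCK_FROM[threshold]
    return floor is not None and probability >= floor
```

This models only the adjustable filters. It says nothing about any additional checks Gemini applies on its own.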

In the code, BLOCK_NONE can be enabled with the following settings:

import google.generativeai as genai
from google.generativeai.types import HarmBlockThreshold, HarmCategory

# Assumes genai.configure(api_key=...) has already been called.

# Set every adjustable category to the most lenient threshold.
safetySettings = [
    {
        "category": HarmCategory.HARM_CATEGORY_HARASSMENT,
        "threshold": HarmBlockThreshold.BLOCK_NONE,
    },
    {
        "category": HarmCategory.HARM_CATEGORY_HATE_SPEECH,
        "threshold": HarmBlockThreshold.BLOCK_NONE,
    },
    {
        "category": HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
        "threshold": HarmBlockThreshold.BLOCK_NONE,
    },
    {
        "category": HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
        "threshold": HarmBlockThreshold.BLOCK_NONE,
    },
]

model = genai.GenerativeModel('gemini-2.0-flash-exp')
# `message` is the prompt or content to send.
response = model.generate_content(
    message,
    safety_settings=safetySettings,
)

Note, however, that setting every category to BLOCK_NONE does not guarantee that Gemini will let the content through. It still judges safety from the context and filters accordingly.
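When a request is filtered anyway, it can help to log what the response reports. The helper below is a rough sketch that assumes the response object exposes `prompt_feedback.block_reason`, `candidates`, `finish_reason`, and `safety_ratings` as in the `google.generativeai` SDK. `explain_filtering` and `_name` are hypothetical names, and attributes are read defensively so the function also works on simple stand-in objects.

```python
def _name(value) -> str:
    """Return an enum member's name, or the plain string form of other values."""
    return getattr(value, "name", str(value))


def explain_filtering(response) -> list[str]:
    """Collect human-readable reasons why a Gemini response was filtered."""
    reasons = []

    # A block reason on the prompt feedback means the prompt itself was rejected.
    feedback = getattr(response, "prompt_feedback", None)
    block_reason = getattr(feedback, "block_reason", None)
    if block_reason:
        reasons.append(f"prompt blocked: {_name(block_reason)}")

    # A candidate that finished with SAFETY was cut off by the output filter.
    for index, candidate in enumerate(getattr(response, "candidates", None) or []):
        if _name(getattr(candidate, "finish_reason", "")) != "SAFETY":
            continue
        flagged = [
            f"{_name(rating.category)}={_name(rating.probability)}"
            for rating in getattr(candidate, "safety_ratings", None) or []
            if _name(rating.probability) not in ("NEGLIGIBLE", "LOW")
        ]
        detail = ", ".join(flagged) or "no rating details"
        reasons.append(f"candidate {index} stopped for safety: {detail}")
    return reasons
```

An empty list means the response reported no filtering. If the list is non-empty while every category is set to BLOCK_NONE, that matches the behavior described above.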

How to Reduce the Probability of Safety Filtering?

In general, the flash series models apply more safety restrictions, while the pro and thinking series models apply relatively fewer, so switching models is worth a try. In addition, when the content may be sensitive, sending less content per request and shortening the context can reduce how often filtering is triggered, at least to some extent.
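Sending less content per request can be as simple as grouping subtitle lines into smaller batches before calling the model. This is a minimal sketch of that idea. `split_into_batches` and the `max_chars` limit of 2000 are illustrative assumptions, not values taken from the tool.

```python
def split_into_batches(lines: list[str], max_chars: int = 2000) -> list[list[str]]:
    """Group lines into batches whose combined length stays within max_chars."""
    batches: list[list[str]] = []
    current: list[str] = []
    size = 0
    for line in lines:
        # Start a new batch when adding this line would exceed the limit.
        if current and size + len(line) > max_chars:
            batches.append(current)
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        batches.append(current)
    return batches
```

Each batch can then be sent as its own request. A single line longer than `max_chars` still gets a batch of its own rather than being split.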

How to Completely Stop Gemini from Making Safety Judgments and Let All of the Above Content Through?

Link an international credit card and switch to a paid monthly premium account.

This is a Tool to Transcribe Audio and Video to SRT Subtitles Using Gemini AI

Pre-packaged Version Download Address

The pre-packaged version only works on Windows 10/11. On macOS and Linux, please deploy from source.

Baidu Netdisk Download: https://pan.baidu.com/s/10gJVMa5L3wnzlf1tFd9euw?pwd=dtpt

GitHub Download: https://github.com/jianchang512/gemini-speech2srt/releases/download/v0.3/GeminiAI-speech2srt-0.3.7z

Audio and video have become an important medium for acquiring knowledge and sharing opinions. Converting them into text efficiently, especially into subtitles with a precise timeline, is usually done with OpenAI's open-source Whisper.

Gemini AI offers a new option. With its strong natural language processing capabilities, it can transcribe audio and video into text quickly and accurately. Gemini AI also provides a generous free daily quota, which is enough for everyday transcription needs.

However, although sending a complete audio or video file directly to Gemini AI quickly returns subtitles in SRT format, the timeline is often not accurate enough, mainly because the timestamps tend to drift when Gemini AI processes long audio files.

To solve this problem, I developed a simple and easy-to-use tool that automatically completes the following steps:

  1. Smart Slicing: Use the VAD (Voice Activity Detection) model to intelligently slice audio and video files into small segments.
  2. Transcribe Each Segment: Send each segment separately to Gemini AI for transcription.
  3. Precise Assembly: Reassemble the transcription results into a complete SRT subtitle file in chronological order to ensure the accuracy of the timeline.
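The assembly step above can be sketched as follows. This is not the tool's actual code: it assumes each segment arrives as a hypothetical `(start_ms, end_ms, text)` tuple, where the times come from the slicing step rather than from the model, and `format_timestamp` and `build_srt` are illustrative names.

```python
def format_timestamp(ms: int) -> str:
    """Convert milliseconds to the SRT timestamp form HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def build_srt(segments: list[tuple[int, int, str]]) -> str:
    """Assemble (start_ms, end_ms, text) segments into SRT text, ordered by start time."""
    # Sort chronologically and skip segments whose transcription came back empty.
    kept = (segment for segment in sorted(segments) if segment[2].strip())
    entries = []
    for number, (start, end, text) in enumerate(kept, start=1):
        entries.append(
            f"{number}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text.strip()}"
        )
    return "\n\n".join(entries) + "\n"
```

Because the timestamps are taken from the slice boundaries, they do not depend on the model estimating times inside a long file.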

No complicated settings are required, just simple operations to get accurate SRT subtitles!


Advantages of Gemini AI:

  • High Accuracy: Gemini AI is based on a powerful AI model and has extremely high speech recognition accuracy, which can accurately capture the content in audio and video.
  • Fast Speed: Thanks to Gemini AI's powerful computing power, the transcription speed is very fast, greatly saving your time.
  • Free Quota: Gemini AI provides a sufficient daily free quota, which is enough to meet the daily needs of audio and video transcription and reduce usage costs.
  • Support for Multiple Formats: This tool supports common audio and video formats without the need for additional format conversion.
  • Precise Timeline: Through smart slicing and segment-by-segment transcription, the timeline of the generated SRT subtitles stays closely aligned with the audio.

How to Use

  1. Get Gemini API Key: First, you need to have a Gemini API Key. If you don't have one yet, please follow the instructions at the end of the article to get one.
  2. Fill in API Key: Paste your Gemini API Key into the GeminiAI Key input box of the tool.
  3. Select Model: It is recommended to select the gemini-2.0-flash-exp model, which has better results and a sufficient free daily quota.
  4. Set Proxy (Optional): If Google services cannot be reached directly from your network, fill in the HTTP proxy address and port.
  5. Select File: Click the large area above to select the audio or video file you want to transcribe.
  6. Start Transcribing: Click the "Start" button, and the tool will automatically complete the process of slicing, transcribing, and assembling subtitles.
  7. View Results: After the transcription is complete, click "Open Result Folder" to find the generated SRT subtitle file.
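For source deployments, one common way to apply the proxy from step 4 is through the standard proxy environment variables. The sketch below is an assumption about how this could be done, not a description of what the tool does internally. `set_http_proxy` is a hypothetical helper, and whether a given client library honors these variables depends on the transport it uses.

```python
import os


def set_http_proxy(address: str) -> None:
    """Route outgoing HTTP(S) requests from this process through `address`."""
    address = address.strip()
    if not address:
        return  # The proxy is optional; leave the environment untouched.
    if "://" not in address:
        address = "http://" + address
    # Many Python HTTP clients read these standard environment variables.
    os.environ["HTTP_PROXY"] = address
    os.environ["HTTPS_PROXY"] = address
```

For example, `set_http_proxy("127.0.0.1:7890")` would point both variables at a local proxy. The address and port here are placeholders.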


How to Get Gemini API Key

  1. Preparation: Make sure your network can reach Google services, for example through a proxy or VPN.
  2. Visit Google AI Studio: Open the website https://aistudio.google.com/apikey.
  3. Register/Login: Sign in with your Google account. If you don't have one, register first.
  4. Create API Key: Click the "Create Key" button.
  5. Copy API Key: Copy the automatically generated API Key.
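If you run the source code rather than the packaged tool, the copied key can be kept out of the code by reading it from an environment variable. This is a minimal sketch: the variable name `GEMINI_API_KEY` and the helpers `load_api_key` and `main` are assumptions for illustration, and `main` needs the `google-generativeai` package and network access.

```python
import os


def load_api_key(env_var: str = "GEMINI_API_KEY") -> str:
    """Read the API key from an environment variable instead of hard-coding it."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} to the key copied from Google AI Studio")
    return key


def main() -> None:
    """Send one short request to confirm that the key works."""
    # Imported here so load_api_key() can be used without the SDK installed.
    import google.generativeai as genai

    genai.configure(api_key=load_api_key())
    model = genai.GenerativeModel("gemini-2.0-flash-exp")
    print(model.generate_content("Reply with the single word: ok").text)
```

Calling `main()` after setting the variable is a quick way to confirm that the key is valid before pasting it into the tool.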
