AI Moderator


Intear

AI Moderator is a bot that helps you automatically delete messages or ban users who send spam or other inappropriate content. It's basically the same as asking ChatGPT “Is this message spam? (text)”, but with more customization, and completely automated.

IMPORTANT: There are tips for setting up the prompt at the end. They are very important: they can help you get the most out of the bot and avoid false positives.

Use cases

1. Delete links: Automatically delete messages that contain links. It can detect all ways of writing a link, like `example.com`, `example[.]com`, `ex*amp*le dot com`, etc., because it uses AI instead of a programmatic check.

2. Remove spam: Delete messages that contain unwanted content. If you don't want to see crypto airdrops or similar posts, you can set up AI Moderator to delete these messages; just tweak the rules a bit. It can also detect spam inside images, because it uses an AI model with vision capabilities.

3. NSFW content: Automatically delete messages that contain NSFW content, including text, images and (soon) videos. Or you can say “delete images, but not text” and it will do that.

4. Remove hate speech: Automatically delete messages that contain hate speech. You can configure it to allow “light profanity” if you don't want complete censorship, but want to keep your community moderated.

5. Keep a specialized chat on topic: Automatically delete messages that are off-topic. You can configure it to allow some specific topics that can be discussed, or topics that can't be discussed. IMPORTANT: The bot doesn't know the context, so it may falsely flag some messages as off-topic. It's a good idea only for specialized chats, such as job posting chats where you can only post vacancies or CVs, but not discuss them.

6. Offline moderation: If you're not always online, you can set up AI Moderator to delete messages that contain specific content, so you don't have to worry about it when you're not online. And when you're online, you can review the deleted messages and decide to return them, ban / mute the user, or not. You can enable and disable AI Moderator at any time, so when you're online, you can disable it and moderate the chat yourself.
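
To illustrate why the link use case (item 1) relies on AI rather than a programmatic check, here is a minimal sketch of a naive pattern-based detector. The pattern and the function name `naive_has_link` are made up for this illustration and are not part of the bot. It catches the plain form of a link but misses the obfuscated spellings mentioned above.

```python
import re

# A naive domain pattern: a label, a literal dot, then a TLD of two or more letters.
# This is an illustrative pattern, not something AI Moderator uses.
NAIVE_LINK = re.compile(r"\b[a-z0-9-]+\.[a-z]{2,}\b", re.IGNORECASE)


def naive_has_link(text: str) -> bool:
    """Return True if the simple pattern finds something that looks like a domain."""
    return NAIVE_LINK.search(text) is not None


if __name__ == "__main__":
    samples = [
        "visit example.com",        # plain link: the pattern matches
        "visit example[.]com",      # bracketed dot: the pattern does not match
        "visit ex*amp*le dot com",  # spelled-out dot: the pattern does not match
    ]
    for sample in samples:
        print(sample, "->", naive_has_link(sample))
```

You could keep adding patterns for each new trick, but spammers can invent new spellings faster than a pattern list can grow, which is where a language model helps.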


How to use

First, send /start in a DM with the bot:

Click “Tools for chats and communities”:

Click “Group Chat” (obviously, there's nothing to moderate in a channel). If you want to moderate the comment section of a channel, you have to click “Group Chat” and select the chat that is linked to the channel.

Now, select the group and add the bot.

If you click “Permissions”, you will be able to set up who can change AI Moderator's settings. By default, it's admins who have permission to change group information (description, link, etc.)

To configure the bot, click “AI Moderator”:

When you click it for the first time, you will be asked a few questions about the group rules to build a prompt for the AI. Don't worry, you can always return to this step later, or set a custom prompt.

After this, you will see a big menu of the bot:

Explanation:

  • Prompt: this is what is sent to the AI along with the message: the rules of your chat. Based on it, the AI decides what to do with the message. Under the hood, the AI classifies each message as one of the following: “Good” (does nothing); “MoreContextNeeded” (same as “Good”); “Inform” (by default, only deletes the message; this is meant for people who just joined and may not have read the rules yet, so they may be real people and shouldn't be banned yet); “Suspicious” (the AI is not 100% sure that the message is bad; by default, mutes the person for 15 minutes); and “Harmful” (the AI is sure that the message is bad; bans the person by default).
  • Message that appears when a message is deleted: When the AI deletes a message or bans a user, the bot sends a message in the chat to let the person know that their message violated the chat's rules. Adding a link to the rules in this message is a good practice. This message is deleted automatically after 1 minute.
  • Enter New Prompt: You can change the prompt, fully customize it as you want. You can customize it to flag certain rules as “Inform”, “Suspicious”, or “Harmful”, if you want to do different things for each rule.
  • Edit Prompt: Write a short instruction (e.g., “Allow all links”) and the AI will generate a new prompt based on your old prompt plus your instruction.
  • Setup Prompt: Run the setup process again, with the same questions as above.
  • Mode: If it's “Testing”, the bot won't actually ban / mute people, will check messages even if you're an admin, and will bypass the “Check X messages” limit.
  • Moderator Chat: That's where all logs will be sent. For example, “<someone> sent a message, and it was flagged as spam: ...”, with buttons “Ban”, “Unmute”, “Return message”, “Add exception”. If you don't have a chat with moderators, create one. Until you set it up, the bot will send these messages in the main chat, which is not ideal, so you can't disable Testing mode until you set the Moderator Chat.
    NOTE: If you're using a mobile version of Telegram, the group may not show up. This is a Telegram bug. To make it show up, make the group public (by adding a @username) and then switch it back to private.
  • Harmful, Sus, Inform: What to do when AI flags a message as Harmful, Suspicious, or Inform. Click the button to cycle possible values.
  • Set Message: Set the “Message that appears when a message is deleted” from above.
  • Sends deletion messages: If disabled, the message from above won't show up, and the bot will just silently delete the offending message.
  • Check X messages: To save some money and avoid banning OGs of your community, you can set it to moderate only the first message, first 3 messages, or any number of messages. If the user was not flagged in the first X messages, their next messages won't be checked.
  • Model: The AI that is used to moderate messages. Use “Recommended (Best)” unless you really know what you're doing.
  • Test: Send a test message and see if it would be flagged. That's useful when you're iterating on your prompt and want to make sure that it works.
  • Add Balance: AI Moderator has a free trial, but it's not free after that. You can buy additional messages with Telegram Stars.
  • Enabled / Disabled: That should be obvious.
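
The way the settings above interact can be sketched roughly as follows. This is only an illustration of the behavior described in this article, not the bot's actual code: the names `Settings`, `should_check`, and `decide` are hypothetical, the default of 3 checked messages is just an example value, and it is an assumption that Testing mode reports the action without enforcing it.

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    GOOD = "Good"
    MORE_CONTEXT_NEEDED = "MoreContextNeeded"
    INFORM = "Inform"
    SUSPICIOUS = "Suspicious"
    HARMFUL = "Harmful"


class Action(Enum):
    NOTHING = "nothing"
    DELETE = "delete message"
    MUTE_15_MIN = "mute for 15 minutes"
    BAN = "ban"


def default_actions() -> dict[Verdict, Action]:
    # Defaults as described in the "Prompt" item above.
    return {
        Verdict.INFORM: Action.DELETE,
        Verdict.SUSPICIOUS: Action.MUTE_15_MIN,
        Verdict.HARMFUL: Action.BAN,
    }


@dataclass
class Settings:
    testing: bool = False
    check_first: int = 3  # example value for "Check X messages"
    actions: dict[Verdict, Action] = field(default_factory=default_actions)


def should_check(settings: Settings, previous_messages: int, is_admin: bool) -> bool:
    """Decide whether a message is sent to the AI at all."""
    if settings.testing:
        return True  # Testing mode ignores the admin and "Check X messages" limits
    if is_admin:
        return False
    return previous_messages < settings.check_first


def decide(settings: Settings, verdict: Verdict) -> tuple[Action, bool]:
    """Return (action, enforced). In Testing mode the action is only reported."""
    if verdict in (Verdict.GOOD, Verdict.MORE_CONTEXT_NEEDED):
        return Action.NOTHING, False
    return settings.actions[verdict], not settings.testing


if __name__ == "__main__":
    live = Settings()
    print(should_check(live, previous_messages=0, is_admin=False))
    print(decide(live, Verdict.SUSPICIOUS))
    print(decide(Settings(testing=True), Verdict.HARMFUL))
```

Cycling a button such as “Harmful” in the menu would correspond to changing one entry of `actions` in this sketch.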

Once you've configured and tested the settings, make sure that the bot is enabled and let it detect some messages. We recommend not making it too harsh for the first couple of days (set moderation actions to “Only Warn” instead of ban / mute). Once you're sure that the bot understands your rules and works as you want, you can set the actions back to ban / mute.

When the bot flags a message, it deletes it and sends a report in the Moderator Chat:

It will also send reports for non-spam messages if Testing mode is on.

When you click “See Reason”, you will see something like this:

If you click “Add Exception”, you can modify the rules in a way that allows this message:

The bot will offer you some suggestions, which you can review and apply, but you can always just edit the prompt manually, if you want.

Non-AI features

The Recommended (fast) model is already fast, with about 0.3 seconds of latency on average. But if you need some easy-to-detect spam to be deleted even faster, we have some features that don't require AI at all and don't burn your credits!

  1. Block mostly emoji: Spammers often write their text in colorful, sometimes animated, letter-shaped custom Telegram Premium emojis to make it almost impossible to detect, since the bot can't look at each emoji's image because that would take a lot of time. You can just tell the AI in the prompt to remove messages that contain mostly emojis, but if you need speed, you can use this option. It's not enabled by default because there might be real people communicating in emojis.
  2. Block forwarded stories: At the time of writing, legit Telegram bots (not the unofficial user-bots that send spam) can't read or interact with stories that are forwarded in the chat, so they're impossible to moderate. The good news is that practically no real users forward stories to public group chats, and even fewer do so among their first 3 messages in the group. So if we just block everyone who forwards stories, the chance that we accidentally ban a human is really low, and this rule is enabled by default.
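
The two checks above could look roughly like the sketch below. It is not the bot's implementation. It assumes messages arrive as Telegram Bot API style dictionaries, where custom emojis appear as entities of type `custom_emoji` with lengths counted in UTF-16 code units, and where a forwarded story shows up as a `story` field on the message. The 0.5 threshold and the function names are made up for this example.

```python
def utf16_len(text: str) -> int:
    """Length of the text in UTF-16 code units, the unit used by entity lengths."""
    return len(text.encode("utf-16-le")) // 2


def custom_emoji_ratio(text: str, entities: list[dict]) -> float:
    """Fraction of the message text that is covered by custom emoji entities."""
    total = utf16_len(text)
    if total == 0:
        return 0.0
    covered = sum(e["length"] for e in entities if e.get("type") == "custom_emoji")
    return covered / total


def is_mostly_emoji(text: str, entities: list[dict], threshold: float = 0.5) -> bool:
    # The threshold is an assumed value; the bot's real cutoff is not documented here.
    return custom_emoji_ratio(text, entities) >= threshold


def is_forwarded_story(message: dict) -> bool:
    # Assumes a forwarded story is exposed as a "story" field on the message.
    return "story" in message


if __name__ == "__main__":
    text = "👍👍 hi"  # each emoji here takes 2 UTF-16 code units, 7 units in total
    entities = [
        {"type": "custom_emoji", "offset": 0, "length": 2},
        {"type": "custom_emoji", "offset": 2, "length": 2},
    ]
    print(is_mostly_emoji(text, entities))
    print(is_forwarded_story({"message_id": 1, "story": {}}))
```

Both checks only look at message metadata, which is why they can run without calling the AI.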

How to Write a Good Prompt

This is the most important part, and it affects the bot's results more than anything else, so pay attention:

The best prompt is a hand-crafted prompt, but we have a good default one. The prompt is the rules of your chat, based on which the bot decides whether to ban someone. But a good prompt differs from ordinary chat rules in some important ways, and here are my tips for setting one up:

  • AI Moderator doesn't have access to the previous messages (yet). So you should tell it to ban people only in very obvious cases, such as sending links to crypto airdrops, attempts to scam people, and other types of spam. If you get a lot of similar spam, consider adding “ban messages like this: <message text>”.
  • You probably don't want to disallow all links. There are cases where real people would send links to a third-party website like Twitter, YouTube, Medium, or your own website. Try something like “Promotions with links that are not related to <your chat name> are not allowed”; this should filter out most of the spam. There is also a separate setting for scams, which should filter out links to crypto airdrops and other scams. Disallowing all links should only be used as a last resort, as the risk of accidentally banning someone is high, and most of the time you can filter out spam using other rules.
  • Same goes for rules that are not violated frequently. As a rule of thumb, if a rule doesn't get someone banned at least once per day, it shouldn't be in the prompt; you can handle those violations yourself. For example, our beta tester chat, Near Validators, had “Price discussion is not allowed” in their prompt. It has probably banned 0 people so far, but it had 1 false positive when someone was talking about seat price (a seat is not a crypto, so it's a different kind of price that should be allowed). If you don't want your chat to have price discussion, but don't expect this rule to be violated frequently, don't add it. Once a week, you can just manually tell 1 person that it's not allowed. I think letting 5 spammers get through for 30 minutes is better than having 1 person banned for nothing.
  • Don't use “Ban” unless you really need to. Use “Mute” or “Mute for 15 minutes” instead: it will have the same effect on spammers, but if the AI accidentally punishes a real person, they won't have to join again after you lift the restriction. Otherwise, they might just give up and decide not to rejoin your chat after getting banned.
  • Edit your prompt after using the prompt builder. The default prompt is a good starting point, but in many cases it's not specific enough. You can achieve much more if you give the AI the context of your chat. For example, in Near Validators, you can say “'Seat price' refers to validator requirements, so discussion of seat prices is allowed”. If you want to ban links, give it a list of allowed links, for example, “Ban all links except for *.intear.tech”. If it's a chat for freelancers, tell it that sharing links to portfolios or to companies that people have worked with is allowed.
  • Under the hood, it asks ChatGPT to classify a message as Good, MoreContextNeeded, Inform, Suspicious, or Harmful. Good and MoreContextNeeded are both treated as “Good”. You can tell the AI to put certain violations into certain categories and give each a different punishment (see Harmful, Sus, Inform above). For example, “NSFW content is not allowed, mark it as Inform”, and Inform sends a message (see Set Message above). If you don't want scams to be handled by the AI and tell ChatGPT to classify a scam as “Good”, it won't do that because it's censored. That's why there is MoreContextNeeded: it acts the same as Good, and there's no separate setting for it.
  • AI Moderator is not a replacement for a human. It can help with 95% of the spam, but the other 5% may be too hard to identify or very context-specific. There are bots that simulate conversations using 2 accounts that send messages to each other, and there are people who just send links with no text at all. That is not effective promotion, because no one knows what's behind the link, but it's hard for a moderator to tell whether it's spam or a legit link. Don't expect the bot to filter out 100% of the spam.
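
As a rough illustration of the tips above, here is a small sketch that assembles a prompt from chat context, a handful of rules with their categories, and a list of allowed link domains. The helper `build_prompt` and the exact wording it produces are hypothetical; the bot accepts free-form text, so treat this only as one way to keep a hand-crafted prompt organized.

```python
VALID_CATEGORIES = ("Inform", "Suspicious", "Harmful")


def build_prompt(context_notes, rules, allowed_domains=()):
    """Assemble prompt text from context notes, (rule, category) pairs, and allowed domains."""
    lines = []
    if context_notes:
        lines.append("Context:")
        lines.extend(f"- {note}" for note in context_notes)
    lines.append("Rules:")
    for text, category in rules:
        if category not in VALID_CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        lines.append(f"- {text} Mark it as {category}.")
    if allowed_domains:
        # Prefer an allow-list over a blanket ban on links.
        lines.append("- Links are not allowed except for: " + ", ".join(allowed_domains) + ".")
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_prompt(
        context_notes=[
            "'Seat price' refers to validator requirements, so discussing it is allowed.",
        ],
        rules=[
            ("NSFW content is not allowed.", "Inform"),
            ("Crypto airdrop promotions and scam attempts are not allowed.", "Harmful"),
        ],
        allowed_domains=["*.intear.tech"],
    )
    print(prompt)
```

Keeping the list of rules short, as recommended above, also keeps the generated text easy to review before you paste it into “Enter New Prompt”.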

If you have any questions, send them in t.me/intearchat!








