Solved Swear Filter Problems

Discussion in 'Plugin Development' started by OTF Catastrophe, Dec 19, 2016.

Thread Status:
Not open for further replies.
  1. Offline

    OTF Catastrophe

    So I'm using this really simple swear filter that seems to work really well, but it works too well. It's a common issue with filter plugins: they either cover only the basics and don't run a thorough check on each word, or they're so aggressive that they flag swears a player wasn't intentionally typing. What I mean is, I need help creating a whitelist for the plugin so you can say things like "assassin" but you can't say "ass". Any help is appreciated, and you don't have to spoonfeed at all; I might end up asking more questions just to understand better. :)


    Code:
    @EventHandler
      public void onPlayerSwearingChat(AsyncPlayerChatEvent event)
      {
        Player player = event.getPlayer();
        List<String> curse = this.plugin.getConfig().getStringList("blacklistenWords");
        String message1 = event.getMessage().toLowerCase().replace(" ", "").replace("_", "")
          .replace(".", "").replace(",", "").replace("-", "").replace("=", "").replace("'", "").replace(";", "")
          .replace("/", "").replace("*", "").replace("!", "").replace("#", "").replace("%", "").replace("^", "")
          .replace("&", "").replace("(", "").replace("|", "").replace(")", "").replace("{", "").replace("}", "")
          .replace(":", "").replace("<", "").replace(">", "").replace("+", "").replace("@", "").replace("$", "");
        String message2 = event.getMessage().toLowerCase().replace(" ", "").replace("_", "")
          .replace(".", "").replace(",", "").replace("-", "").replace("=", "").replace("'", "").replace(";", "")
          .replace("/", "").replace("*", "").replace("!", "").replace("#", "").replace("%", "").replace("^", "")
          .replace("&", "").replace("(", "").replace("|", "").replace(")", "").replace("{", "").replace("}", "")
          .replace(":", "").replace("<", "").replace(">", "").replace("+", "").replace("@", "a").replace("$", "s");
        for (String word : curse) {
          if ((message1.contains(word)) || (message2.contains(word)))
          {
            event.setCancelled(true);
            player.sendMessage(ChatColor.DARK_RED + "Swearing is not allowed in this server!");
            break;
          }
        }
      }
     
  2. Offline

    timtower Administrator Moderator

    @OTF Catastrophe May I suggest putting all those characters in an array and looping over them?
    And that example is easy: you can just check whether the word is exactly equal to "ass".
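
    A minimal sketch of the array idea, assuming the same character set as the replace() chain in the first post. The class and method names (ChatNormalizer, strip, substituteLookalikes) are made up for illustration and are not part of the original plugin.

```java
class ChatNormalizer {
    // Characters removed before matching; copied from the replace() chain in the first post.
    private static final char[] STRIPPED = " _.,-=';/*!#%^&(|){}:<>+@$".toCharArray();

    // Lowercases the message and removes every character listed in STRIPPED.
    static String strip(String message) {
        String result = message.toLowerCase();
        for (char c : STRIPPED) {
            result = result.replace(String.valueOf(c), "");
        }
        return result;
    }

    // Maps the two look-alike symbols to letters, as the message2 variant does.
    // Call this before strip(), otherwise '@' and '$' are already gone.
    static String substituteLookalikes(String message) {
        return message.replace('@', 'a').replace('$', 's');
    }
}
```

    With this, message1 would be strip(event.getMessage()) and message2 would be strip(substituteLookalikes(event.getMessage())).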
     
  3. Offline

    OTF Catastrophe

    Yeah an array would look a little nicer aha. But the issue with the filter is that if "ass" is blacklisted, it still gets caught inside "assassin". I'm assuming you mean checking whether the word is exactly equal to "ass", but the point of building message1 and message2 is to filter out as much as possible. And even if I only checked one word at a time, looking for exact matches wouldn't be effective, as players could just type "asss" and bypass it since that doesn't equal "ass".

    Unless you mean something different entirely, if so could you explain a little more?
     
  4. Offline

    timtower Administrator Moderator

    @OTF Catastrophe No, that is exactly what I mean.
    Didn't see that message1 and message2 have different replaces at the end either.

    How about a percentage match? "ass" would be a 75% match for "asss",
    but a much lower one for "assassin".
    The letters need to occur in that order.
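
    One possible reading of the percentage idea, as a rough sketch: if the letters of the banned word appear in the typed word in order, the score is the banned word's length divided by the typed word's length. The class name PercentMatch and this exact scoring rule are assumptions, not something spelled out in the thread.

```java
class PercentMatch {
    // Returns banned.length() / word.length() when the letters of `banned`
    // occur in `word` in the same order (not necessarily adjacent); otherwise 0.0.
    static double score(String banned, String word) {
        if (banned.isEmpty() || word.isEmpty()) {
            return 0.0;
        }
        int matched = 0;
        for (int i = 0; i < word.length() && matched < banned.length(); i++) {
            if (word.charAt(i) == banned.charAt(matched)) {
                matched++;
            }
        }
        if (matched < banned.length()) {
            return 0.0; // not all letters were found in order
        }
        return (double) banned.length() / word.length();
    }
}
```

    Under this rule score("ass", "asss") is 0.75 and score("ass", "assassin") is 0.375, so a cutoff somewhere in between would separate the two. A harmless word like "pass" also scores 0.75, so this alone doesn't remove the need for exceptions.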
     
  5. Offline

    OTF Catastrophe

    I honestly didn't even notice that, even with how little code it is aha. I'll actually have to make both the same so I can still do the array.

    Percentages could work but before I try any of that, is there a regex method related to a whitelist that could help in this situation? I'm not sure if it's even a thing honestly but it's worth asking aha.
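
    For what it's worth, a basic whitelist may not need regex at all. A rough sketch, assuming a second config list of allowed words (the whitelist parameter and the WhitelistFilter class are hypothetical): blank out every whitelisted word first, then run the usual contains() check on what is left.

```java
import java.util.List;

class WhitelistFilter {
    // Returns true if a blacklisted word remains after whitelisted words are blanked out.
    static boolean isBlocked(String message, List<String> blacklist, List<String> whitelist) {
        String cleaned = message.toLowerCase();
        for (String allowed : whitelist) {
            if (allowed.isEmpty()) {
                continue; // replacing "" would insert a space between every character
            }
            // Replace with a space so the neighbouring text cannot join into a new word.
            cleaned = cleaned.replace(allowed.toLowerCase(), " ");
        }
        for (String banned : blacklist) {
            if (!banned.isEmpty() && cleaned.contains(banned.toLowerCase())) {
                return true;
            }
        }
        return false;
    }
}
```

    This would have to run on the message before spaces and punctuation are stripped out, since the whitelisted words are matched as typed.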
     
  6. Offline

    timtower Administrator Moderator

  7. @OTF Catastrophe
    There is a really convenient way to measure how different two strings are, which is incredibly useful in these sorts of swear filters. It's called the Levenshtein distance. Here's the Wikipedia page (it even contains a code implementation, although in C!):
    https://en.wikipedia.org/wiki/Levenshtein_distance

    EDIT: one thing I forgot to mention is that you'll probably have to keep an exception list you can add words to, since a word like "duck" becomes a swear word if you replace the d with an f.
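
    A Java sketch of the same idea, for anyone who would rather not read the C version: the standard dynamic-programming Levenshtein distance, plus an exception set that is checked first. The FuzzyFilter class, the isBlocked helper and the maxDistance parameter are illustrative assumptions, not code from the Wikipedia page or from any plugin.

```java
import java.util.Set;

class FuzzyFilter {
    // Levenshtein distance: minimum number of single-character insertions,
    // deletions or substitutions needed to turn `a` into `b`.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j characters of b
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i characters of a into ""
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Blocks a single word if it is within maxDistance edits of a banned word,
    // unless it is listed in the exception set.
    static boolean isBlocked(String word, Set<String> blacklist, Set<String> exceptions, int maxDistance) {
        String w = word.toLowerCase();
        if (exceptions.contains(w)) {
            return false;
        }
        for (String banned : blacklist) {
            if (levenshtein(w, banned) <= maxDistance) {
                return true;
            }
        }
        return false;
    }
}
```

    With "ass" blacklisted and a maximum distance of 1, "asss" (distance 1) is blocked and "assassin" (distance 5) is not, but "ask" (distance 1) is only let through if it is in the exception set.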
     
    Last edited: Dec 19, 2016
  8. Offline

    OTF Catastrophe

    I've seen videos on YouTube about it I believe, and coincidentally it was actually someone showing off their language filter comparing words to each language. It was a self-taught program, which was really interesting. The only issue is I honestly have 0 knowledge of C, and working all that out for a swear filter that I'd have to add a large number of exceptions to might be a lot of unneeded work.

    I might end up just making a plugin where the filter covers the basics and leaving it at that instead of trying to make a whitelist. My thinking is that no matter how much work goes into it or how much time is spent, there's always going to be a way to bypass the filter aha.

    Thank you @timtower for your help and suggestions and thank you @AlvinB for your suggestion as well. :)
     
    timtower likes this.
  9. @OTF Catastrophe
    Well, if you know Java, you can quite easily understand that bit of code, so I figured it being in C wasn't a problem :p

    And I agree, most of these swear filters get quite nasty to set up quite quickly, which is why most server owners don't even bother with them. Humans are simply more effective.

    And to add on to that, the severity of most swear words depends on context, so unless you've got a program that can understand language context (not likely to happen soon), you're going to have to stick with a rather basic filter.
     
    OTF Catastrophe likes this.