PSA: Lemmy votes can be manipulated

koper@feddit.nl · 1 year ago

PSA: Lemmy votes can be manipulated

PetrichorBias@lemmy.one · edit-2 1 year ago

This was a problem on reddit too. Anyone could create accounts - heck, I had 8 accounts:

one main, one alt, one “professional” (linked publicly on my website), and five for my bots (whose accounts were optimistically created, but were never properly run). I had all 8 accounts signed in on my third-party app and I could easily manipulate votes on the posts I posted.

I feel like this is what happened when you’d see posts with hundreds / thousands of upvotes but had only 20-ish comments.

There needs to be a better way to solve this, but I’m unsure if we truly can solve this. Botnets are a problem across all social media (my undergrad thesis many years ago was detecting botnets on Reddit using Graph Neural Networks).

Fwiw, I have only one Lemmy account.

simple@lemmy.world · 1 year ago

Reddit had ways to automatically catch people trying to manipulate votes though, at least the obvious ones. A friend of mine posted a reddit link for everyone to upvote on our group and got temporarily suspended for vote manipulation like an hour later. I don’t know if something like that can be implemented in the Fediverse but some people on github suggested a way for instances to share to other instances how trusted/distrusted a user or instance is.

cynar@lemmy.world · 1 year ago

An automated trust rating will be critical for Lemmy, longer term. It’s the same arms race as email has to fight. There should be a linked trust system of both instances and users. The instance ‘vouches’ for the users trust score. However, if other instances collectively disagree, then the trust score of the instance is also hit. Other instances can then use this information to judge how much to allow from users in that instance.

fmstrat@lemmy.nowsci.com · 1 year ago

This will be very difficult. With Lemmy being open source (which is good), bot maker’s can just avoid the pitfalls they see in the system (which is bad).

hawkwind@lemmy.management · 1 year ago

LLM bots has make this approach much less effective though. I can just leave my bots for a few months or a year to get reputation, automate them in a way that they are completely indistinguishable from a natural looking 200 users, making my opinion carry 200x the weight. Mostly for free. A person with money could do so much more.

cynar@lemmy.world · 1 year ago

It’s the same game as email. An arms race between spam detection, and spam detector evasion. The goal isn’t to get all the bots with it, but to clear out the low hanging fruit.

In your case, if another server noticed a large number of accounts working in lockstep, then it’s fairly obvious bot-like behaviour. If their home server also noticed the pattern and reports it (lowers the users trust rating) then it wont be dinged harshly. If it reports all is fine, then it’s also assumed the instance might be involved.

If you control the instance, then you can make it lie, but this downgrades the instance’s score. If it’s someone else’s, then there is incentive not to become a bot farm, or at least be honest in how it reports to the rest.

This is basically what happens with email. It’s FAR from perfect, but a lot better than nothing. I believe 99+% of all emails sent are spam. Almost all get blocked. The spammers have to work to get them through.

TWeaK@lemm.ee · 1 year ago

RIP u/unidan

ඞmir@lemmy.ml · 1 year ago

I miss everyone being Unidan

70ms@lemmy.world · edit-2 1 year ago

I got suspended multiple times because my partner and daughter were also in our city’s sub, and sometimes one of them would upvote my comments without realizing it was me. It got really fucking annoying, and of course there’s no way to talk to a real person at reddit to prove we’re different people. I’d appeal every time and they’d deny it every time. How reddit could have gotten so huge without realizing that multiple people can live in the same household is beyond me. In the end they both just stopped upvoting anything in the sub because it was too risky (for me).

TheSaneWriter@lemm.ee · 1 year ago

Hearing that, I wonder if they were using an IP address based system. That would cause real problems for people using a VPN, but it wouldn’t surprise me.

Derproid@sh.itjust.works · 1 year ago

That’s such a hilariously bad metric for detecting a bot network too. It wouldn’t even work to detect a real one, so all that policy ever did was annoy real users.

Thorny_Thicket@sopuli.xyz · 1 year ago

I got that message too when switching accounts to vote several times. They can probably see it’s all coming from the same ip.

PeleSpirit@lemmy.world · 1 year ago

I think it’s the 3rd party app or VPN thing that would have saved your friend.

esty@lemmy.ca · 1 year ago

nope, i tried manipulating votes from apollo once and got a warning

PeleSpirit@lemmy.world · 1 year ago

Were you on a VPN?

esty@lemmy.ca · 1 year ago

nope, so that’s probably it

Dandroid@dandroid.app · 1 year ago

If you and several other accounts all upvoted each other from the same IP address, you’ll get a warning from reddit. If my wife ever found any of my comments in the wild, she would upvoted them. The third time she did it, we both got a warning about manipulating votes. They threatened to ban both of our accounts if we did it again.

But here, no one is going to check that.

BrianTheeBiscuiteer@lemmy.world · 1 year ago

Yes, I feel like this is a moot point. If you want it to be “one human, one vote” then you need to use some form of government login (like id.me, which I’ve never gotten to work). Otherwise people will make alts and inflate/deflate the “real” count. I’m less concerned about “accurate points” and more concerned about stability, participation, and making this platform as inclusive as possible.

PetrichorBias@lemmy.one · edit-2 1 year ago

In my opinion, the biggest (and quite possibly most dangerous) problem is someone artificially pumping up their ideas. To all the users who sort by active / hot, this would be quite problematic.

I’d love to actually see some social media research groups actually consider how to detect and potentially eliminate this issue on Lemmy, considering Lemmy is quite new and is malleable at this point (compared to other social media). For example, if they think metric X may be a good idea to include in all metadata to increase chances of detection, then it may be possible to include this in the source code of posts / comments / activities.

I know a few professors and researchers who do research on social media and associated technologies, I’ll go talk to them when they come to their office on Monday.

theolodger@feddit.uk · edit-2 1 year ago

!remindme - oh wait…

Lumidaub@feddit.de · 1 year ago

@[email protected] 1 day

:)

Remind Me@mstdn.social · 1 year ago

@Lumidaub Ok, I will remind you on Monday Jul 10, 2023 at 9:36 AM PDT.

zuhayr@lemmy.world · 1 year ago

I have been thinking about this government id aspect too. But it’s not coming to me.

Users sign up with govt ID, obtain a unique social media key that’s used for all activities beyond the sign up. One key per person, but a person can have multiple accounts? You know, like that database primary key.

The relationship between the govt id and social media key needs to be in a zero knowledge encryption so that no one can corelate the real person with their online presence. THIS is the bummer.

BrianTheeBiscuiteer@lemmy.world · 1 year ago

This also vaguely reminds me of some advanced networking topics. In mesh networks there is the possibility of rogue nodes causing havoc and different methods exist to reduce their influence or cut them out of the process.

SrElsewhere@lemmy.world · 1 year ago

These downvotes indicate that some of the assholes have now migrated.

MigratingtoLemmy@lemmy.world · 1 year ago

Congratulations on such a tough project.

And yes, as long as the API is accessible somebody will create bots. The alternative is far worse though

vis4valentine@lemmy.ml · 1 year ago

I have like tens of accounts on reddit.

Thorny_Thicket@sopuli.xyz · 1 year ago

I always had 3 or 4 reddit accounts in use at once. One for commenting, one for porn, one for discussing drugs and one for pics that could be linked back to me (of my car for example) I also made a new commenting account like once a year so that if someone recognized me they wouldn’t be able to find every comment I’ve ever written.

On lemmy I have just two now (other is for porn) but I’m probably going to make one or two more at some point

authed@lemmy.ml · 1 year ago

I have about 20 reddit accounts… I created/ switched account every few months when I used reddit

Takatakatakatakatak@lemmy.dbzer0.com · 1 year ago

I don’t know how you got away with that to be honest. Reddit has fairly good protection from that behaviour. If you up vote something from the same IP with different accounts reasonably close together there’s a warning. Do it again there’s a ban.

PetrichorBias@lemmy.one · 1 year ago

I did it two or three times with 3-5 accounts (never all 8). I also used to ask my friends (N=~8) to upvote stuff too (yes, I was pathetic) and I wasn’t warned/banned. This was five-six years ago.

Hexorg@beehaw.org · 1 year ago

I think the best solution there is so far is to require captcha for every upvote but that’d lead to poor user experience. I guess it’s the cost benefit of user experience degrading through fake upvotes vs through requiring captcha.

Catsrules@lemmy.ml · 1 year ago

I could see this being useful on a per community basis. Or something that a moderator could turn on and off.

For example on a political or news community during an election. It might be worth while to turn captcha on.

magnetosphere @beehaw.org · 1 year ago

If any instance ever requires a captcha for something as trivial as an upvote, I’ll simply stop upvoting on that instance.

Hexorg@beehaw.org · 1 year ago

Yes that’s what I meant by degrading user experience

ඞmir@lemmy.ml · 1 year ago

It wouldn’t stop bots because they would just use any instance without the captcha

AndrewZabar@beehaw.org · 1 year ago

On Reddit there were literally bot armies by which thousands of votes could be instantly implemented. It will become a problem if votes have any actual effect.

It’s fine if they’re only there as an indicator, but if the votes are what determine popularity, prioritize visibility, it will become a total shitshow at some point. And it will be rapid. So yeah, better to have a defense system in place asap.

Puph@lemmy.dbzer0.com · 1 year ago

I had all 8 accounts signed in on my third-party app and I could easily manipulate votes on the posts I posted.

There’s no chance this works. Reddit surely does a simple IP check.

Salamander@mander.xyz · 1 year ago

I would think that they need to set a somewhat permissive threshold to avoid too many false positives due to people sharing a network. For example, a professor may share a reddit post in a class with 600 students with their laptops connected to the same WiFi. Or several people sharing an airport’s WiFi could be looking at /r/all and upvoting the top posts.

I think 8 accounts liking the same post every few days wouldn’t be enough to trigger an alarm. But maybe it is, I haven’t tried this.

Valmond@lemmy.ml · 1 year ago

I had one main account but also a couple for using when I didn’t want to mix my “private” life up with other things. I don’t even know if it’s not allowed in the TOS?

Anyway, I stupidly made a Valmond account on several Lemmy instances before I got the hang of it, and when (if!) my server will one day function I’ll make an account there so …

I guess it might be like in the old forum days, you have a respectable account and another if you wanted to ask a stupid question etc. admin would see (if they cared) but not the ordinary users.

averyminya@beehaw.org · 1 year ago

Reddit will definitely send you PM’s for vote manipulation

Andy@lemmy.world · 1 year ago

I’m curious what value you get from a bot? Were you using it to upvote your posts, or to crawl for things that you found interesting?

PetrichorBias@lemmy.one · edit-2 1 year ago

The latter. I was making bots to collect data (for the previously-mentioned thesis) and to make some form of utility bots whenever I had ideas.

I once had an idea to make a community-driven tagging bot to tag images (like hashtags). This would have been useful for graph building and just general information-lookup. Sadly, the idea never came to fruition.

Andy@lemmy.world · 1 year ago

Cool, thank you for clarifying!

InternetPirate@lemmy.fmhy.ml · edit-2 1 year ago

I feel like this is what happened when you’d see posts with hundreds / thousands of upvotes but had only 20-ish comments.

Nah it’s the same here in Lemmy. It’s because the algorithm only accounts for votes and not for user engagement.

AndrewZabar@beehaw.org · 1 year ago

Yeah votes are the worst metric to measure anything because of bot voters.

impulse@lemmy.world · 1 year ago

I see what you mean, but there’s also a large number of lurkers, who will only vote but never comment.

I don’t think it’s unfeasible to have a small number of comments on a highly upvoted post.

SGforce@lemmy.ca · 1 year ago

If it’s a meme or shitpost there isn’t anything to talk about

PetrichorBias@lemmy.one · 1 year ago

Maybe you’re right, but it just felt uncanny to see thousands of upvotes on a post with only a handful of comments. Maybe someone who active on the bot-detection subreddits can pitch in.

RedCowboy@lemmy.world · 1 year ago

I agree completely. 3k upvotes on the front page with 12 comments just screams vote manipulation

randomname01@feddit.nl · 1 year ago

True, but there were also a number of subs (thinking of the various meirl spin-offs, for example) that naturally had limited engagement compared to other subs. It wasn’t uncommon to see a post with like 2K upvotes and five comments, all of them remarking how little comments there actually were.

AndrewZabar@beehaw.org · 1 year ago

May I ask how do you format your text? My format bar has disappeared from wefwef.

PetrichorBias@lemmy.one · edit-2 1 year ago

I don’t use wefwef, I use jerboa for android.

**bold**

*italics*

> quote

`code`

# heading

- list

AndrewZabar@beehaw.org · edit-2 1 year ago

Ah ok. Yeah I thought the markdown was the same as reddit being markdown but it used to have a toolbar.

Thanks for response.

Also I’ve wondered why don’t they have an underline markdown.

TWeaK@lemm.ee · edit-2 1 year ago

Fun fact: old reddit used to use one of the header functions as an underline. I think it was 5x # that did it. However, this was an unofficial implementation of markdown, and it was discarded with new reddit. Also, being a header function you could only apply it to an entire line or paragraph, rather than individual words.

🐱TheCat@sh.itjust.works · 1 year ago

IMO the best way to solve it is to ‘lower the stakes’ - spread out between instances, avoid behaviors like buying any highly upvoted recommendation without due diligence etc. Basically, become ‘un-advertiseable’, or at least less so

FartsWithAnAccent@lemmy.world · 1 year ago

I’d just make new usernames whenever I thought of one I thought was funny. I’ve only used this one on Lemmy (so far) but eventually I’ll probably make a new one when I have one of those “Oh shit, that’d be a good username” moments.

Azzu@lemm.ee · 1 year ago

You can change your display name on Lemmy to whatever you want whenever you want.

FartsWithAnAccent@lemmy.world · 1 year ago

Oh neat! Thanks!

Rearsays@lemmy.ml · 1 year ago

I would imagine this is the same with bans I imagine there will be a future reputation watchdog set of servers which might be used over this whole everyone follows the same modlog. The concept of trust everyone out of the gate seems a little naive

sparr@lemmy.world · 1 year ago

Web of trust is the solution. Show me vote totals that only count people I trust, 90% of people they trust, 81% of people they trust, etc. (0.9 multiplier should be configurable if possible!)

nekat_emanresu@lemmy.ml · edit-2 1 year ago

Love that type of solution.

I’ve been thinking about an admin that votes on example posts to define the policy, and then getting users scored against it, then using high scorers to represent user copies of the admins spirit of moderation, and then make systems that use that for automoderation.

e.g. I vote yes, no, yes. I then run the script that checks my users that have voted in all three, and the ones with the highest matching votes that i define(must be 100% matching to my votes) gets counted as “matching my spirit of moderation”. If a spirit of moderation user downvotes or reports then it can be auto flagged into an admin console for me to then rapidly view instead of sifting through user complaints, and if things get critically spicy i can promote them to emergency mods, or automate their reports so that if a spirit user and a random user both report, it gets auto removed.

CanadianNomad@lemmy.world · edit-2 1 year ago

deleted by creator

OsrsNeedsF2P@lemmy.ml · 1 year ago

Fwiw, search engines need to figure out what is “reliable”. The original implementations were, well if BananaPie.com is referenced by 10% of the web, it must be super trustworthy! So people created huge networks of websites that all linked each other and a website they wanted to promote in order to gain reliability.

interdimensionalmeme@lemmy.ml · 1 year ago

Your client has to compute the raw data, not the server or else it will just be your server manipulating what you see and think.

shagie@programming.dev · edit-2 1 year ago

deleted by creator

interdimensionalmeme@lemmy.ml · 1 year ago

Client must computer all raw data. All individual moderation action (vote,block, subscribe) would be made public by default and stealth optional.

Only user led moderation has a future, it all has to be transparent, public, client sided, optional and consensual

sparr@lemmy.world · 1 year ago

It could be implemented on both the server and the client, with the client trusting the server most of the time and spot checking occasionally to keep the server honest.

The origins of upvotes and downvotes are already revealed on objects on Lemmy and most other fediverse platforms. However, this is not an absolute requirement; there are cryptographic solutions that allow verifying vote aggregation without identifying vote origins, but they are mathematically expensive.

shagie@programming.dev · edit-2 1 year ago

deleted by creator

interdimensionalmeme@lemmy.ml · 1 year ago

It’s nothing. You don’t recompute everything for each page refresh. Your sucks well the data, compute reputation total over time and discard old raw data when your local cache is full.

Historical daily data gets packaged, compressed, and cross signed by multiple high reputation entities.

When there are doubts about a user’s history, your client drills down those historical packages and reconstitute their history to recalculate their reputation

Whenever a client does that work, they publish the result and sign it with their private keys and that becomes a web of trust data point for the entire network.

Only clients and the network matter, servers are just untrustworthy temporary caches.

Opafi@feddit.de · 1 year ago

Any solution that only works because the platform is small and that doesn’t scale is a bad solution though.

sugar_in_your_tea@sh.itjust.works · 1 year ago

That sounds a bit hyperbolic.

You can externalize the web of trust with a decentralized system, and then just link it to accounts at whatever service you’re using. You could use a browser extension, for example, that shows you whether you trust a commenter or poster.

That list wouldn’t get federated out, it could live in its own ecosystem, and update your local instance so it provides a separate list of votes for people in your web of trust. So only your admin (which could be you!) would know who you trust, and it would send two sets of vote totals to your client (or maybe three if you wanted to know how many votes it got from your instance alone).

So no, I don’t think it needs to be invasive at all.

SQL_InjectMe@partizle.com · 1 year ago

What if the web of trust is calculated with upvotes and downvotes? We already trust server admins to store those.

sugar_in_your_tea@sh.itjust.works · 1 year ago

I think that could work well. At the very least, I want the feature where I can see how many times I’ve upvoted/down voted a given individual when they post.

That wouldn’t/shouldn’t give you transitive data imo, because voting for something doesn’t mean you trust them, just that the content is valuable (e.g. it could be a useful bot).

shagie@programming.dev · edit-2 1 year ago

deleted by creator

sugar_in_your_tea@sh.itjust.works · 1 year ago

My point is you can have a mixed system. For example:

server stores list of “special interest” users (followed users, WoT, mods, etc)
server stores who voted for what (already does)
client updates the server’s list of “special interest” users with WoT data
when retrieving metadata about a post, you’d get:
- total votes
- votes from “special interest” users
- total votes from your instance

That’s not a ton of data, and the “special interest” users wouldn’t need to be synchronized to any other instance. The client would store the WoT data and update the server as needed (this way the server doesn’t need any transitive logic, the client handles it).

Zeppo@sh.itjust.works · 1 year ago

Facebook and Twitter have always had their equivalent of upvotes be public.

rDrDr@lemmy.world · 1 year ago

This was a great feature of reddit enhancement suite.

bloodfart@lemmy.ml · 1 year ago

Get rid of votes. They suck.

🦄🦄🦄@feddit.de · 1 year ago

Nah, I want to downvote Nazis. Their opinions don’t matter and should be suppressed.

Skull giver@popplesburger.hilciferous.nl · 1 year ago

deleted by creator

bloodfart@lemmy.ml · 1 year ago

Suppress nazis by bullying them, not by passively downvoting their hate speech and moving on.

🦄🦄🦄@feddit.de · 1 year ago

You can do both.

bloodfart@lemmy.ml · 1 year ago

I guess. Ones really effective and tells everyone around you that the person is a nazi in case they were cloaking it, pushes back on their bullshit and makes everyone aware that it’s not okay to say shit like that and that it is okay to fight them.

The other is a downvote and changes where the nazi content ends up in a rank.

nekat_emanresu@lemmy.ml · 1 year ago

Nazis will always act in bad faith and it shouldn’t surprise us when they use their 10 alts to fuck up voting, which is another reason to hide votes and focus on commenting rather than voting. Although i don’t agree with the negative style of confrontation, the positive and neutral are great though. Commentate on each bad faith action they take in real time so the audience understands how stupid nazis are, and becomes resistant to bad faith tactics.

shagie@programming.dev · edit-2 1 year ago

deleted by creator

bloodfart@lemmy.ml · 1 year ago

Too bad activitypub is carved in stone and can’t be changed for any reason.

shagie@programming.dev · edit-2 1 year ago

deleted by creator

bloodfart@lemmy.ml · 1 year ago

You can take the guy out of Reddit…

My reply to you was sarcasm. Specifications can be changed. Things can be removed from them.

shagie@programming.dev · edit-2 1 year ago

deleted by creator

bloodfart@lemmy.ml · 1 year ago

Yeah, it would break all those things for which votes provide no benefit. They should be broken.

There isn’t any use for votes on a platform that isn’t using them to automatically rank content for the purposes of profit.

Running a voteless instance of one of those does no good because the problem is structural. Votes aren’t secure and their whole purpose is to manipulate what content gets shown to users. People using the votes to make something get shown (or not) isn’t a bug, it’s a feature.

The existence of a system that ranks content according to votes changes how people behave on the platform. Spinning up an instance that just doesn’t allow or show votes doesn’t change the problem that all the content is produced using the vote system and reflects it.

shagie@programming.dev · edit-2 1 year ago

deleted by creator

Skull giver@popplesburger.hilciferous.nl · 1 year ago

Arriving and Departing are also part of ActivityPub but I don’t see anyone sharing their travels here. It’s up to the server software or instance itself to determine what kind of ActivityStreams activities are supported or not.

In fact, there are explicit Like and Dislike activities, yet some software chooses to implement upvotes through boosts/favourites. Both are valid options in my opinion, especially if interoperability with tools like Mastodon are desired, but this goes to show how you can easily deviate from the spec.

If you really want, you can make a perfectly compliant ActivityPub server without any kind of text content. I can imagine a Strava-like app sending location updates and using departure/arrival to mark the start and end of a run, for example.

Disabling votes and relying purely on comment count wouldn’t be that hard. On the other hand, comments can be generated automatically by copying them from other articles/servers and users can be generated on the fly, so that wouldn’t solve anything, it’d just make it slightly harder to deal with the issue.

rektangel@lemmy.world · 1 year ago

How else would you rank the content?

bloodfart@lemmy.ml · 1 year ago

You don’t. Ranked content is a solution for owners of social media platforms to avoid paying moderators. It’s a no brainer if you want a cheap automatic advertising platform but isn’t great and requires constant intervention if you’re not monetizing somehow.

mintyfrog@lemmy.ml · 1 year ago

PSA: internet votes are based on a biased sample of users of that site and bots

7heo@lemmy.ml · edit-2 1 year ago

expired

TheGreatHerald@sh.itjust.works · edit-2 1 year ago

deleted by creator

kolorafa@lemmy.world · 1 year ago

This would be rather to detect and alert admin of a bad actors (instances) and then admin can kick it off from federation same for other tupe of offences.

7heo@lemmy.ml · 1 year ago

This could become a problem on posts only relevant on one server

Obviously, on the server the posts are from, you display the full vote count. There, the admins know the accounts, can vet them, etc.

nekat_emanresu@lemmy.ml · 1 year ago

Interesting idea.

SQL_InjectMe@partizle.com · 1 year ago

Small instances are cheap, so we need a way to prevent 100 bot instances running on the same server from gaming this too

7heo@lemmy.ml · edit-2 1 year ago

expired

Skull giver@popplesburger.hilciferous.nl · 1 year ago

How would you prevent someone using wildcard domains from spamming servers the same way they can spam clients? The Fediverse has no way to distinguish between subdomains and normal domains. Anyone running an instance through classic DDNS would be affected by this.

The approach could work, but it would invalidate some major assumptions in the Fediverse itself. The algorithm would also need to make sure a few single user instances don’t get to sway entire servers.

SkyNTP@lemmy.ml · 1 year ago

So far, the majority of content that approaches spam I’ve come across on Lemmy has been posts on [email protected] which highlight an issue attributed to the fediverse, but which ultimately have a corollary issue on centralised platforms.

Obviously there are challenges to address running any user-content hosting website, and since Lemmy is a comminity-driven project, it behooves the community to be aware of these challenges and actively resolve them.

But a lot of posts, intentionally or not, verge on the implication that the fediverse uniquely has the problem, which just feeds into the astroturfing of large, centralized media.

czarrie@lemmy.world · 1 year ago

The nice things about the Federated universe is that, yes, you can bulk create user accounts on your own instance - and that server can then be defederated by other servers when it becomes obvious that it’s going to create problems.

It’s not a perfect fix and as this post demonstrated, is only really effective after a problem has been identified. At least in terms of vote manipulation from across servers, it could act if it, say, detects that 99% of new upvotes are coming from a server created yesterday with 1 post, it could at least flag it for a human to review.

two_wheel2@lemm.ee · 1 year ago

It actually seems like an interesting problem to solve. Instance runners have the sql database with all the voting record, finding manipulative instances seems a bit like a machine learning problem to me

Pleonasm@programming.dev · 1 year ago

There’s an XKCD for that: https://xkcd.com/810/

flux@lemmy.ml · 1 year ago

One other thing is that you can bulk create your own instances, and that’s a lot more effort to defederate. People could be creating those instances right now and just start using them after a year; at least they have incurred some costs during that…

I believe abuse management in openly federated systems (e.g. Lemmy, Mastodon, Matrix) is still an unsolved problem. I doubt good solutions will arrive before they become popular enough to attract commercial spammers.

AeroSoap@lemm.ee · edit-2 1 year ago

deleted by creator

Black_Gulaman@lemmy.dbzer0.com · 1 year ago

Then they will just distribute their bots equally to other legit servers, and by that, defederation is not a viable solution anymore.

One other problem are real human troll farms

bdonvr@thelemmy.club · 1 year ago

If they can do that, they could’ve done it on a traditional site anyway

DigitalJacobin@lemmy.ml · 1 year ago

“Legit” instances are able to moderate/control the spam coming from their users.

pingveno@lemmy.ml · 1 year ago

I wonder if there’s a machine learning technique that can be used to detect bot-laden instances.

Sean Tilley@lemmy.ml · 1 year ago

Honestly, thank you for demonstrating a clear limitation of how things currently work. Lemmy (and Kbin) probably should look into internal rate limiting on posts to avoid this.

I’m a bit naive on the subject, but perhaps there’s a way to detect “over x amount of votes from over x amount of users from this instance”? and basically invalidate them?

jochem@lemmy.ml · 1 year ago

How do you differentiate between a small instance where 10 votes would already be suspicious vs a large instance such as lemmy.world, where 10 would be normal?

I don’t think instances publish how many users they have and it’s not reliable anyway, since you can easily fudge those numbers.

Sean Tilley@lemmy.ml · 1 year ago

10 votes within a minute of each other is probably normal. 10 votes all at once, or microseconds of each other, is statistically less likely to happen.

I won’t pretend to be an expert on the subject, but it seems like it’s mathematically possible to set some kind of threshold? If a set percent of users from an instance are all interacting microseconds from each other on one post locally, that ought to trigger a flag.

Not all instances advertise their user counts accurately, but they’re nevertheless reflected through a NodeInfo endpoint.

CybranM@feddit.nu · 1 year ago

Surely the bot server can just set up a random delay between upvotes to circumvent that sort of detection

milicent_bystandr@lemmy.ml · 1 year ago

I wonder if it’s possible …and not overly undesirable… to have your instance essentially put an import tax on other instances’ votes. On the one hand, it’s a dangerous direction for a free and equal internet; but on the other, it’s a way of allowing access to dubious communities/instances, without giving them the power to overwhelm your users’ feeds. Essentially, the user gets the content of the fediverse, primarily curated by the community of their own instance.

🐱TheCat@sh.itjust.works · 1 year ago

when you say import tax do you mean actual monetary payment? Or a computing power tax? I don’t think I understand

Taako_Tuesday@lemmy.ca · 1 year ago

I was reading it as lowering the value of an upvote from instances that are known to harbor click farming accounts. I could be wrong though.

zuhayr@lemmy.world · 1 year ago

Creating a foreign exchange for upvotes? 1 upvote from lemmy.world account = 25 upvotes from acconamatta.basementlemmy?

manucode@infosec.pub · 1 year ago

Maybe adjust by the number of upvotes coming from that instance (negatively) and by the number of upvotes users of your instance give over their (positively). If one instance spams upvotes, these upvotes loose value. If posts on that instance are popular with your users, the upvotes coming from that instance are more likely to have been made by real users. Maybe we can find a better metric to estimate the number of real, active users on another instance.

nutomic@lemmy.ml · edit-2 1 year ago

Sounds interesting, imilar to the way googles page rank works.

lemming007@lemm.ee · edit-2 1 year ago

That defeats the purpose of decentralization and creates a dangerous precedent. The entire point of Lemmy is that every instance is equally valid and legitimate. If certain instances are elevated above others, we’re on our way to do what Gmail and Microsoft did to email.

milicent_bystandr@lemmy.ml · 1 year ago

So, I didn’t mean instances treated unequally in the grand, set-in-protocol scheme of the fediverse - as if some centralised authority/agreement that this instance counts for more than that. Just as defederation doesn’t make meta’s instance authoritatively illigitimate.

But an instance can choose, within that instance, to defederate with another; likewise an instance within itself could deprioritise some or all others’ instances’ votes.

Still agree dangerous precedent …but still wonder if some sort of instance-controlled moderation of external content is eventually necessary in the future. Or, I suppose, there could be separate services (much like ad-block lists) that users individually could enable to auto-moderate/adjust their own feeds.

And (sorry for waffling!) I suppose it depends a lot on how much you browse specific communities and how much you scroll “all” or whatever. Back in the before-days, I’m used to subbing to very few communities, and generally lazily browsing r/all

milicent_bystandr@lemmy.ml · 1 year ago

Out of interest, within a community (that’s what a sublemmy is called, right?) is there any facility to prioritise votes of people subscribed to that community over those not subscribed? Was that the thing with brigading before (sorry, didn’t realise this before!) that mods can moderate and ban posts/posters but not votes/voters?

milicent_bystandr@lemmy.ml · 1 year ago

I agree it would be a dangerous precedent.

Thing is, though, every instance is not equally valid and legitimate: that’s the reason for defederating from Threads.

Not sure what you mean by what Gmail and Microsoft did to email? Do you mean that they assume many unknown email origins are spam? Though Gmail’s obviously attracted a lot of users, and I myself have moved off it now to paying for my email provider elsewhere, I was under the impression it’s been quite good for email and for pushing secure email, and being good at anti-spam.

lemming007@lemm.ee · 1 year ago

I mean that Microsoft and Gmail took over the email protocol and right now if you stand up your own email server with a new domain/IP you basically have zero chance to get your mail delivered anywhere. They’ve positioned themselves as “higher” authority because of the sheer number of users they control and can now control the entire email system.

Same thing could happen with instances if we elevate lemme.world or any other instance to be “more legitimate” so their user votes count higher.

Dodecahedron December@sh.itjust.works · 1 year ago

Uh no. Just implement DKIM if your messages are not being sent correctly. Spam is killing email, making admins implement more protocols such as DKIM but that isn’t “google and Microsoft killing email”

milicent_bystandr@lemmy.ml · edit-2 1 year ago

Yeah, that’s the idea

Edit: but I was thinking the result to be specific to your instance, rather than a fediverse-wide vote-rank standardisation.

So, e.g. to a viewer signed into lemmy.ml votes from within lemmy.ml would count more; but to the member of ispamlemmywithhate.crap, votes from ispamlemmywithhate.crap would count more

Skull giver@popplesburger.hilciferous.nl · 1 year ago

deleted by creator

lemming007@lemm.ee · 1 year ago

What is the definition of a “fake account”?

NightAuthor@beehaw.org · edit-2 1 year ago

Are you an academic or just dense?

SafetyGoggles@feddit.de · 1 year ago

@[email protected]

Are you an academic or just dense?

I thought beehaw is all about inclusivity and safe space and friendly shit

NightAuthor@beehaw.org · edit-2 1 year ago

Who says I’m not being inclusive. If I want to provide a helpful answer to the question, I must know what perspective they’re asking from.

navigatron@beehaw.org · 1 year ago

Let’s go academic with it, and skip straight past “impossible to answer” directly to heuristic / attribute analysis.

What are the attributes / behaviors / tells of a fake account?

hawkwind@lemmy.management · 1 year ago

In this context it would be an account with the sole purpose of boosting the visible popularity of a post or comment.

lemming007@lemm.ee · edit-2 1 year ago

But that’s kinda the point of all posts. You post because you want people to see something and you want your post to be popular so it can be seen by the largest amount of people.

hawkwind@lemmy.management · edit-2 1 year ago

Your right. You just asked what a “fake account” was though. I think it’s generally accepted that if you create “alt” accounts for the sole purpose of vote manipulation, you’re being a dick.

lemming007@lemm.ee · 1 year ago

Why am I being a dick, I was genuinely curious. What do you mean “vote manipulation”? Like making a post with one account and creating another one to upvote the post?

hawkwind@lemmy.management · 1 year ago

I didn’t mean YOU are being a dick. If SOMEONE creates “alt” accounts for the sole purpose of vote manipulation, they’re being a dick. I was using the royal “you,” a weird english language thing. You, yourself, are not a dick. We’ll you might be, but I don’t think so.

lemming007@lemm.ee · 1 year ago

Sorry, I misunderstood. I definitely agree accounts created for the sole purpose of upvoting stuff/bot farms are bad. I just don’t know if there’s an effective way to fight it as they’re getting pretty elaborate these days and it’s hard to distinguish them from real accounts.

Pretty soon we’ll be at the point where no one will trust anything on the Internet.

Thoralf Will@discuss.tchncs.de · 1 year ago

People may not like it but a reputation system could solve this. Yes, it’s not the ultimate weapon and can surely be abused itself.

But it could help to prevent something like this.

How could it work? Well, each server could retain a reputation score for each user it knows. Every up- or downvote is then modified by this value.

This will not solve the issue entirely, but will make it less easy to abuse.

patatahooligan@lemmy.world · 1 year ago

Ok, but what would the reputation score be based on that can’t be manipulated or faked?

badcommandorfilename@lemmy.world · 1 year ago

Well, you see Kif, my strategy is so simple an idiot could have devised it: reputation is adjusted by “votes” so that other users can up or downvote another.

Thus solving the problem, once and for all.

patatahooligan@lemmy.world · 1 year ago

I’m assuming this is a joke based on the Futurama references you used, but just to be clear for everyone: this won’t work because it simply moves the problem one step further. How do you prevent bots from upvoting other bots to build a reputation?

Hazzia@discuss.tchncs.de · 1 year ago

It’s okay buddy, I understood your joke.

Thoralf Will@discuss.tchncs.de · 1 year ago

As mentioned: It’s not the silver bullet solution but something that raises the bar for abuse. The reputational score is build up over time on the specific server based on the up- and downvotes you received.

So, yes, this can be abused itself as well - but it requires a lot more effort.

gthutbwdy@lemmy.sdf.org · 1 year ago

I think people often forget federation is not a new thing, it’s a first design for internet communication services. Email, which is predating the Internet, is also federated network and most popular widely adopted of them all modes of Internet communication. It also had spam issues and there where many solutions for that case.

The one I liked the most was hashcash, since it requires not trust. It’s the first proof-of-work system and it was an inspiration to blockchains.

SALT@lemmy.my.id · 1 year ago

Now days email spam filter especially proprietary from Google or Verizon yahoo really make indie mail server harder to maintain and always got labeled as spam even with DKIM, dmarc, right spf, and clean reputable public IP

zumi@lemmy.sdf.org · edit-2 1 year ago

I don’t know what the answer is, but I hope it is something more environmentally friendly than burning cash on electricity. I wonder if there could be some way to prove time spent but not CPU.

lasagna@programming.dev · edit-2 1 year ago

Wouldn’t a detection system be way better? I can see a machine learning model handling this rather well. Correlate the main accounts to their upvoters across all their posts and create a flag if it returns positive. It would be more of a mod tool, really.

I have already ran into a very obvious Russian troll factory account and it really drags down the quality of the place. Freedom of speech shouldn’t extend to war criminals and I’d rather leave any clusterfuck that allows it, whether they do it through will or incompetence.