There needs to be a concerted effort by the entire privacy community towards data poisoning. Actual privacy is no longer attainable, but everything collected can still be made useless.


[deleted]

A simple thing is to install a browser add-on like TrackMeNot that does random word searches every so often on a list of search engines.
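The TrackMeNot idea (periodically firing meaningless searches so your real queries drown in noise) can be sketched in a few lines of Python. This is a toy illustration, not the add-on's actual code; the word list and engine URL templates are placeholders for the example.

```python
import random
import urllib.parse

# Toy decoy-query generator in the spirit of TrackMeNot.
# The word list and engine URL templates are made up for this sketch.
WORDS = ["sailing", "recipe", "guitar", "volcano", "spreadsheet", "marathon"]
ENGINES = {
    "duckduckgo": "https://duckduckgo.com/?q={}",
    "bing": "https://www.bing.com/search?q={}",
}

def decoy_query(rng):
    """Join 1-3 random words into a meaningless search query."""
    return " ".join(rng.sample(WORDS, rng.randint(1, 3)))

def decoy_url(rng):
    """Build a search URL for a random engine with a decoy query."""
    template = ENGINES[rng.choice(sorted(ENGINES))]
    return template.format(urllib.parse.quote_plus(decoy_query(rng)))

# The real add-on fetches such URLs on a randomized schedule so the
# decoys blend in with genuine searches.
```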

I've got your back mate.

Do this once a week.

Read em and weep: https://blog.mozilla.org/en/firefox/hey-advertisers-track-this/

Straight to the fun part https://trackthis.link/

Not my content. Sharing from a different user

Would it still be useful if I've had uBlock for like ~5 years now? I thought it automatically blocked most of these invasive cookies on its own.

It's up to you to decide if that's something you wish to incorporate. You can see what it blocks and weigh that against your threat model.

It would be cool if you could do this with a Raspberry Pi, sort of like how a Pi-hole blocks ads on your network.
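For context, a Pi-hole works by answering DNS queries for known tracker domains with a sinkhole address, so the requests never leave your network. A minimal sketch of just the matching logic (the two blocklist entries are a toy example; a real Pi-hole runs a full DNS server against much larger curated lists):

```python
# Minimal sketch of Pi-hole-style DNS sinkholing: blocked domains (and
# their subdomains) resolve to 0.0.0.0 so tracker requests go nowhere.
# The two blocklist entries are just a toy example.
BLOCKLIST = {"doubleclick.net", "google-analytics.com"}

def resolve(domain):
    """Return a sinkhole address for blocked domains, else defer upstream."""
    parts = domain.lower().rstrip(".").split(".")
    # Check the domain itself and every parent suffix against the blocklist.
    for i in range(len(parts) - 1):
        if ".".join(parts[i:]) in BLOCKLIST:
            return "0.0.0.0"
    return None  # not blocked: a real resolver would query upstream DNS
```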

β€œTrack This”… made by Firefox????????

Firefox takes privacy very seriously.

Or just use multi-account containers on Firefox and delete cookies. This works extremely well.

FF now does first-party isolation with the default settings -- they branded it Total Cookie Protection. So you shouldn't need to use containers just for site isolation anymore (although it's still useful for keeping the porn account logged in)

Do you know a good add-on that can let you specify what cookies are kept versus deleted when you close FF? I automatically delete all cookies which is a little bit of a pain for some sites. I'd like to specify what is deleted and what isn't.

I think Cookie Autodelete can do this

Thanks! Trying this one out now.

Forget Me Not can do that.

You can set exceptions without any extensions. Bloating the browser with extensions makes it easier to fingerprint.

Yeah, there's a built-in option in settings.

Privacy Badger, NoScript, AdNauseam, and Decentraleyes.

On Firefox, of course; and if you don't use DRM websites, LibreWolf.

Decentraleyes often breaks sites after you gather enough fonts.

Which sites have you encountered this on?

I stopped using it after it started bugging, specifically on hdtoday.tv where the movie thumbnails don't show up (probably happens on other sites with thumbnails/images of the same type), and NexusPipe captchas too

Maybe send that info to the dev; I'm sure they'd love to hear from you.

Is there anything that prevents apps on your phone tracking your location?

Not completely, but you can minimise it. Take a look at your privacy settings; keep GPS off by default and only turn it on when you need it.

Remove apps' ability to turn on Bluetooth or Wi-Fi whenever they want, and to scan. Also stop using Google Location Services.

Turn Location Services off when you don't need it, though it may be useful for tracking your phone/device if you lose it.

And never give a weather app access to your location.

Not much can be done at present, but one example would be layering every image you post with 20 other transparent images, so facial recognition datasets with your face in them can't confirm who you are. The biggest problem is adversarial machine learning, because with every move we make the AI improves.

Edit - ”Steganography”

There's a tool I saw posted here once, that does that automatically. You give it a picture with people in it and it returns a copy indistinguishable for humans but completely unreadable for facial recognition. I wish I could remember its name.

Edit: Probably Fawkes. If there's another one do let me know!

was it fawkes?

Looks a lot like it!

If you can find it again let us know!

From u/signal-insect's comment, it was probably Fawkes.

Oh shit this is great 😳 thank you!!

This seems like something that AI can be trained to circumvent.

A circumvention another AI can be trained to circumvent?


When I hear the name "Fawkes" I'm reminded of Fallout 3.

I think you're just spit-balling, but the layers thing wouldn't work. Content-analysis systems are able to 'see' the image as we do; they would not be aware of any hidden layers. Those would be found by a metadata/EXIF/stream parser/demuxer.

Completely spit-balling πŸ˜‚ sweeping metadata before upload should be standard practice no matter what though (ideally spoofing too) so that was kind of just assumed tbh.

Absolute transparency of the added images would be pointless, I agree, but the thought-process is essentially stacking nearly invisible but still barely perceivable images onto your main image and then taking a screenshot of that and sweeping/spoofing metadata prior to posting.
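For what it's worth, the stack-and-screenshot idea amounts to low-alpha compositing, which is easy to sketch; as noted above, a classifier sees the flattened result much as a human does, so on its own this changes little. (Flat grayscale pixel lists stand in for real images here.)

```python
def blend(base, overlay, alpha=0.03):
    """Composite a nearly transparent overlay onto the base (flattened
    grayscale pixel lists), then 'screenshot' by rounding back to ints."""
    return [round((1 - alpha) * b + alpha * o) for b, o in zip(base, overlay)]

# At 3% opacity the overlay barely shifts pixel values, which is exactly
# why a classifier looking at the flattened pixels is barely affected.
```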

Do you mind explaining more on how it is that AI can β€œsee” in the same way as humans? Idea here was to play with the limits of human perception and find middle ground where other people don’t notice but the AI can’t figure it out, or even better just ends up identifying the image as something else previously identified and categorizes it with that instead of being flagged for review by an actual analyst. Total shot in the dark though.

The image can be "seen" by using an FFT to summarize the content and then an image classifier (machine learning) to compare samples to known objects or things. This is done by training the model, not by comparing each one individually. The model knows what a dog looks like after training.

The AI would be composed of a few parts. One part looks at the image like a computer: it finds all the hidden stuff and format properties. Another part is the detection or classification algorithm, which attempts to 'see' what the image is made of by comparing it, as a whole and potentially in parts, to known images. This step is done by a machine-learning network, trained on FFT-style summaries, that has learned to classify images.

Facebook and Google already run image classifiers on any photos that pass through their systems. Here is an image classification from Instagram (the photo is a hand touching a dog wearing sunglasses): "May be an image of one or more people and a dog".

If you're really interested in how the images are processed in the FFT step, you can look at this software for an example: https://github.com/qarmin/czkawka It's a duplicate-file finder that supports similar videos and images, meaning it can detect different quality levels of the same photo or video. To do this, it generates a match score based on the similarities of the FFT-processed images. FFT is like a way to summarize data by rounding off the noise.
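The "summarize then compare" step can be illustrated with a simpler cousin of FFT/DCT perceptual hashing, the average hash: shrink the image, threshold each pixel against the mean, and compare hashes by Hamming distance. This is a toy sketch (czkawka's actual algorithms differ), using tiny grayscale pixel grids instead of real images.

```python
def average_hash(pixels):
    """Hash a small grayscale image: one bit per pixel, above/below the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming(a, b):
    """Count differing bits; a small distance means 'probably the same image'."""
    return bin(a ^ b).count("1")

# A re-encoded or slightly brightened copy hashes close to the original,
# while a genuinely different image lands far away.
```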

Thank you πŸ€™πŸ»

Tbh still somewhat confused though as to how this would be a bad thing? If I can manage to get AI to identify images of myself with puppies instead of with me then I’d say that’s job done, no? Granted, the second it hits an analyst’s desk that’s game over and time for a pivot, but that’s just the nature of the beast when privacy as a whole is such a cat and mouse game.

Perhaps I just don’t understand enough about the topic yet. Appreciate the reading.

Oh, it's not a bad thing. Smart systems are great, but their use and purpose need human care.

My comments were a response to the layers being an effective tactic to thwart AI detection. I wanted to point out that they are not, so people don't accidentally give away private info thinking it would work.

However... there is a practice called steganography, which is the embedding of images within images. This is a great video on the topic: https://www.youtube.com/watch?v=TWEXCYQKyDc Steganography might be able to fly under AI detection, but it would not be used to poison the AI. A bad steg image just looks like two images, and the AI would see that as well. A good steg image looks like one image, and the AI would see the one unless... it already knew how to undo the steganography tactic that was used.
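A minimal sketch of one common steganography method, least-significant-bit (LSB) embedding: each pixel's lowest bit is overwritten with one bit of the hidden message, changing pixel values by at most 1. This is illustrative only; real tools typically encrypt the payload and spread the bits around.

```python
def embed(cover, message):
    """Hide message bits in the least significant bit of each pixel value."""
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    assert len(bits) <= len(cover), "cover image too small"
    stego = list(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # each pixel changes by at most 1
    return stego

def extract(stego, n_bytes):
    """Read the LSBs back out and reassemble the hidden bytes."""
    bits = [p & 1 for p in stego[: n_bytes * 8]]
    return bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[b * 8 : b * 8 + 8]))
        for b in range(n_bytes)
    )
```

The recipient only needs to know the method (read LSBs, in order) to recover the data, which is the "key" the later comment mentions.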

Appreciate the correction! Steganography is exactly what I’m looking for here! Not the same one, but a video I had seen years ago on this same subject is the source of the idea, so I’m glad to know the correct terminology now.

Someone else commented about Fawkes which I’m looking into now, but do you have any thoughts on that?

I need to add a disclaimer to my posts that anything I say should not be taken as advice and should be reviewed by a 3rd party before following. πŸ˜‚

Fawkes is different and it's designed to target the actual data points that the image models use to classify images. One major data point is distance between the eyes. When Fawkes runs it makes minor changes to these areas to throw off the training or classification of the model. When training a model, any variation in these 'ground truths' would be considered poison to the model.

So Fawkes can change ear height and eye distance by one pixel each, and maybe the images cannot be classified anymore. This type of obfuscation is very targeted, and I would not assume that a perturbation that defeats one AI will work on them all, or even on any others.

Imagine the Photoshop liquify/swirl tool used on a face, but in a very subtle way and only affecting the measurement points. That's what Fawkes is doing.
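In cartoon form (this is not Fawkes's real algorithm, which optimizes perturbations against a learned feature extractor; the landmark names and coordinates here are invented for illustration):

```python
import random

# Invented landmark coordinates; real cloaking tools work on learned
# feature-space representations, not raw landmark positions.
LANDMARKS = {"left_eye": (30, 40), "right_eye": (70, 40), "nose": (50, 60)}

def cloak(landmarks, rng, max_shift=1):
    """Nudge each measurement point by at most one pixel in each axis."""
    return {
        name: (x + rng.randint(-max_shift, max_shift),
               y + rng.randint(-max_shift, max_shift))
        for name, (x, y) in landmarks.items()
    }
```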

From the website:

Fawkes cannot:

Protect you from already-existing facial recognition models. Instead, Fawkes is designed to poison future facial recognition training datasets.

So they are aware of the FFT step averaging out the subtle changes made by Fawkes. It only works on new datasets, because models require "ground truth" to learn from.

Excellent breakdown. I wonder if redundancy would be better or worse here, though, in combination with steganography. Makes me think having used Fawkes could easily become an identifier in and of itself, no?

Yes, actually. Fawkes would have to make different, non-repeating changes to the photos, or else the AI would build a model of the altered person and be able to recognize those fakes.

The AI model doesn't know the real truth, it only knows what we show it and tell it to look for. So it would totally work for detecting fakes as well.

There are tricks we can do to detect the small variations that Fawkes makes but it becomes much harder when only 1 copy of the photo exists. Check this out https://fotoforensics.com/tutorial.php?tt=about

Definitely will check it out! Thanks again for the reading material!

My bad. Steganography is very cool. You can get free software to make steg images yourself, even on a phone, and then send them around. The key to using it is that your recipient knows which method to use to undo it and get the data out. There are a few methods to achieve steganography.

Cue 'Every Step You Take'

I use an extension on Firefox that does it. Not sure how well it works, but it's installed.

Unfortunately most people on r/privacy don't know shit about cybersec or compsci, just judging by how they lose their shit when apps take info (hint! they need that info to fix bugs. !!!)

I used to use a data-poisoning plugin called AdNauseam. It's the easiest way to poison trackers, but it didn't really catch much with the other stuff I have set up, so I eventually disabled it.

My current setup is Firefox containers, with Switch Container and Temporary Containers. These containers let me isolate sites, so they can't see my cookies from other sites, and things like that. It's like running different sites in different virtual machines. It's also nice to be able to right-click > select Reddit throwaway account A or main account B, or whatever you call your accounts, and switch Reddit accounts without signing out and in. You can even keep two accounts open simultaneously in two different tabs.

On top of that container setup, I have a bunch of privacy plugins (Privacy Badger, DuckDuckGo, etc.), ad blockers (uBlock, etc.), and a cookie remover. I usually only use the cookie remover for bypassing paywalls, but it can be used to confuse trackers too. I also use a VPN, so fingerprinting me by my IP address or by protocols is tricky. I'm still fingerprintable, but it would take more effort than it's worth.

best one is https://adnauseam.io/

"Actual privacy is no longer attainable"

That's a bit too defeatist. It's certainly much harder than it has any right to be and requires far too much attention to compartmentalization, but it's attainable.

Regarding poisoning though, I'm not sure how well it'd work considering the existing relatively noiseless datasets.

Never say never, for sure, but yeah for the average person I think unattainable is a relatively suitable way to describe it.

Regarding smoothing, yeah, poisoning would also be pretty hard to pull off, as it would mean obfuscating literally every move you make in an extremely erratic manner.

Personally, I think the ideal is trying to stay ahead of the curve, maintaining privacy at both the hardware and the software level, and poisoning all data as a failsafe.

[deleted]

That's an option, but you could also simply heavily compartmentalize your online activities and your offline activities (keeping them at a minimum also helps).

"XYZ works there and does that, constant schedule. Never leaves home otherwise. All other information unknown." (Remote work would make that even more limited).

Unfortunately meatspace is pretty much lost in practical terms.

[deleted]

Yes, but if none of these chores are relevant to anything you actually care about in your life, you still maintain some privacy on those things which you do care about and which are harder to observe (particularly if you make some effort to make it so).

As I described. Every step of something utterly unremarkable and useless, with everything else unknown.

Of course undoing mass surveillance which you describe should still be a priority, but it's easy to re-implement so I'd have some doubts about how long that'd last.

Who is leading the effort?

We each have to lead it ourselves, and audit each other in the process. I don’t have an easy path forward to provide, just an idea to hopefully plant the seed.

It would be great to see everyone stop chasing their tails running privacy software on inherently insecure hardware, which negates everything they're doing from step one. For example: running Tor without neutering the Intel Management Engine means you're not hiding anything. The only thing saving you from a knock on your door by the alphabet boys is due process and jurisdiction, but everything is still collected/analyzed/profiled/shared.

The average person doesn't care about privacy. Which is weird because we live in our own living spaces. So I really do think the first step is to raise awareness where possible.

JShelter modifies your JavaScript data requests.

Rob Andersen @ Grape ID is leading this effort (me, writing this... and I invite anyone to call my bluff). After six years of R&D we're finally releasing a workable app that will both 1) hide your data, and 2) be attractive and usable for everyday people so that it becomes massively adopted (which is the prerequisite for the right solution to make our data "useless"). Also, we have to further define exactly what data we're referencing.

For example, I have said on YouTube and in person to many people that I'll PUBLICLY PUBLISH my SSN, credit card numbers, phone number, etc. once our app reaches mass adoption. I will do this because at that point that specific data will be "useless": no one will be able to create fake credit accounts, charge my cards, or spam my phone.

Until the right solution reaches mass adoption, the best strategy right now is to HIDE our data using encryption, tokenization, etc. I made another comment below with an example... would love to hear your feedback because you can literally download our app and start posting on social media (and even Reddit soon if we want) in a totally 100% private, encrypted way. You'll see in my other comment. I'm here to help. BTW my app is always free, no "gotchya", and there's a legit business model that doesn't put individuals like us at risk.

You'd have more luck updating the OS to not let user-space processes be aware of what else is running. They have no business even getting a list of running processes, imo.

Degoogled phone, privacy browser, and stop using social media!

Yes and we need many different methods of data poisoning to make it harder to detect.

Agree 100%!

This is the "Reverse-Huxley Maneuver"

This is the most amazing concept I’ve heard of in at least a decade. Imagine if we all did this…

This is an outstanding suggestion

Doesn’t Brave already assist in this? What are some good Chromium apps?