I took a class senior year with Cynthia which tried to cover some of the things that she had discovered. One of the most surprising things about her work is that she brings this full algorithmic firepower to problems that I had always considered outside the realm of mathematics - fairness and privacy. Some of her greatest achievements, I think, are figuring out what problems to solve (and of course solving them). She really stood out as one of our best faculty even in a very crowded room.
At least in the US, having a Ph.D. and insisting on being referred to as "Doctor" is considered a real asshole move. Usually one might address the person once as "Dr. Whatever", but they will invariably reply, "Please, call me Jane".
Even referring to someone as "Dr. Whatever" in the third person is pretty unusual, if you've ever met them. If I were to speak of my college professors right now, I certainly wouldn't use "Doctor". Maybe if they were 70 or had a Nobel or something.
(Related: In the movie Avatar, one scientist introduces himself to another scientist as "Doctor Norm Whatever". It absolutely clangs, at least to my ears.)
If you know them sure. But for a professor who I took a class with? I would always call them Professor Whatever if my only interaction with them was in a classroom context.
Maybe a generational thing? Not sure when you went to school, I went to a midsized state school 3 years ago and mostly referred to professors by their first or last name. I think this is more common in CS perhaps? I don't think my physics friends did the same thing as often though.
Related: as an anesthesiology associate professor (M.D.), I always told the residents to call me Joe. However, there were always a couple who either would not or could not do so, and addressed me for the entirety of their three-year residencies as Dr. Stirt. Diff'rent strokes
I think she asked us to call her by her first name at the beginning of the semester and it stuck (but I can't remember and I might be being rude). Honestly I'm as surprised as you are - mulling it over I'm sure I called her Prof. Dwork when talking to her, since that's how I addressed most professors I didn't know well personally.
I went to University of Illinois at Chicago (not UIUC or UofC). If I worked with a professor for a while (not in class, more for research-oriented things), sometimes it would be appropriate to address them by their first name, but I never did.
There were only two professors I was on a first name basis with in college. One was my advisor, I also worked in his lab for two years. The other was another professor in the department. It was different in grad school. It was first names for professors you worked with. At that point, they don't know more than you do about your research.
A U.S. thing I’ve noticed is calling professors “Dr. Smith” when elsewhere (and certainly in Germany) that would almost be an insult, since it’s “Prof. Smith”.
I know a German professor with two Ph.D.s who calls himself "Professor Doktor Doktor So-and-so". He doesn't introduce himself like that, mind you, but he uses it in his .signature file (is it still called that nowadays? the text that's automatically appended to your email).
It's an introduction to differential privacy for an academic audience (i.e. not necessarily computer scientists). It sweeps across a range of surprising real-life privacy attacks that are possible against anonymization approaches that feel good enough. Really gives you a sense of the sort of problem privacy protection is in today's world of greatly increased data collection and computational power.
As an interesting example of what differential privacy is, consider this excerpt from Wikipedia:
"A simple example, especially developed in the social sciences,[15] is to ask a person to answer the question "Do you own the attribute A?", according to the following procedure:
1. Toss a coin.
2. If heads, then toss the coin again (ignoring the outcome), and answer the question honestly.
3. If tails, then toss the coin again and answer "Yes" if heads, "No" if tails.
(The seemingly redundant extra toss in the first case is needed in situations where just the act of tossing a coin may be observed by others, even if the actual result stays hidden.) The confidentiality then arises from the refutability of the individual responses.
But, overall, these data with many responses are significant, since positive responses are given to a quarter by people who do not have the attribute A and three-quarters by people who actually possess it. Thus, if p is the true proportion of people with A, then we expect to obtain (1/4)(1-p) + (3/4)p = (1/4) + p/2 positive responses. Hence it is possible to estimate p.
In particular, if the attribute A is synonymous with illegal behavior, then answering "Yes" is not incriminating, insofar as the person has a probability of a "Yes" response, whatever it may be."
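The coin-flip procedure and the estimator from the excerpt are easy to check with a quick simulation. This is just a sketch of the mechanism described above; the function names and the true proportion of 0.3 are my own hypothetical choices:

```python
import random

def randomized_response(truth: bool) -> bool:
    """One respondent's answer under the two-coin procedure."""
    # First toss: heads -> answer honestly (the second, ignored toss
    # only matters if an observer can see you flipping).
    if random.random() < 0.5:
        return truth
    # Tails: second toss decides the answer uniformly at random.
    return random.random() < 0.5

def estimate_p(responses) -> float:
    """Invert E[fraction of Yes] = 1/4 + p/2 to recover p."""
    frac_yes = sum(responses) / len(responses)
    return 2 * (frac_yes - 0.25)

random.seed(0)
true_p = 0.3
population = [random.random() < true_p for _ in range(100_000)]
responses = [randomized_response(t) for t in population]
print(f"estimated p = {estimate_p(responses):.3f}")  # should land near 0.3
```

With 100,000 respondents the estimate lands within a percentage point or so of the true proportion, even though no individual answer reveals anything with certainty.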
Not only is this differentially private, it's locally differentially private, which is an even stronger privacy definition. It's "local" because the user adds randomness themselves. Generic differential privacy is a weaker definition because it lets whoever's running the algorithm collect raw data and then add randomness somewhere in the computation pipeline to produce privatized outputs.
This kind of example also predates the definition of differential privacy by about 40 years [1], although the motivation is pretty much the same.
Is anyone aware of any work for how to apply differential privacy to language models?
So the main question I have is: let's say I'm working with sensitive data like emails or doctors' notes. How can I train an ML model that would still learn something useful without leaking private data?
When I say "leak", an example would be: I train an RNN on some company email data, and when I feed the RNN "$AMZN" the network says SELL.
How can I quantify how much the model has learnt and how much privacy has been leaked?
Check out Song Han's lab at MIT (he's best known for NN compression) - their paper on gradient leakage and the follow-up work. https://arxiv.org/abs/1906.08935
I attended a talk by him recently and was very impressed by the work of his lab in this area.
This is not the original differential privacy paper; this is an invited talk (I agree with you that it remains a great read). The original paper is "Calibrating Noise to Sensitivity in Private Data Analysis"[1], with four authors: Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith.
More accurately, the typeset output has ”treat like alike” instead of “treat like alike”, which suggests that the TeX input (most likely) had "treat like alike" instead of ``treat like alike''. Some historical context that led to features like this (or gotchas, today) in TeX and other systems of the time is here: https://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html