Celebrity Twitter accounts display ‘bot-like’ behavior – Phys.Org

'Celebrity' Twitter accounts - those with more than 10 million followers - display more bot-like behaviour than users with fewer followers, according to new research.

The researchers, from the University of Cambridge, used data from Twitter to determine whether bots can be accurately detected, how bots behave, and how they impact Twitter activity.

They divided accounts into categories based on total number of followers, and found that accounts with more than 10 million followers tend to retweet at similar rates to bots. In accounts with fewer followers, however, bots tend to retweet far more than humans. These celebrity-level accounts also tweet at roughly the same pace as bots with similar follower numbers, whereas in smaller accounts, bots tweet far more than humans. Their results will be presented at the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM) in Sydney, Australia.

Bots, like people, can be malicious or benign. The term 'bot' is often associated with spam, offensive content or political infiltration, but many of the most reputable organisations in the world also rely on bots for their social media channels. For example, major news organisations, such as CNN or the BBC, who produce hundreds of pieces of content daily, rely on automation to share the news in the most efficient way. These accounts, while classified as bots, are seen by users as trustworthy sources of information.

"A Twitter user can be a human and still be a spammer, and an account can be operated by a bot and still be benign," said Zafar Gilani, a PhD student at Cambridge's Computer Laboratory, who led the research. "We're interested in seeing how effectively we can detect automated accounts and what effects they have."

Bots have been on Twitter for the majority of the social network's existence - it's been estimated that anywhere between 40 and 60% of all Twitter accounts are bots. Some bots have tens of millions of followers, although the vast majority have fewer than a thousand - human accounts have a similar distribution.

In order to reliably detect bots, the researchers first used the online tool BotOrNot (since renamed BotOMeter), which is one of the few available online bot detection tools. However, their initial results showed high levels of inaccuracy. BotOrNot showed low precision in detecting bots that had bot-like characteristics in their account name, profile info, content, tweeting frequency and especially redirection to external sources. Gilani and his colleagues then decided to take a manual approach to bot detection.

Four undergraduate students were recruited to manually inspect accounts and determine whether they were bots. This was done using a tool that automatically presented Twitter profiles, and allowed the students to classify the profile and make notes. Each account was collectively reviewed before a final decision was reached.

In order to determine whether an account was a bot (or not), the students looked at different characteristics of each account. These included the account creation date, average tweet frequency, content posted, account description, whether the user replies to tweets, likes or favourites received and the follower to friend ratio. A total of 3,535 accounts were analysed: 1,525 were classified as bots and 2,010 as humans.

The students showed very high levels of agreement on whether individual accounts were bots. However, they showed significantly lower levels of agreement with the BotOrNot tool.
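The article does not say which agreement statistic the researchers used. As a rough illustration only, here is a minimal sketch of Cohen's kappa, one common way to score agreement between two annotators beyond what chance would produce. The function name and the labels below are hypothetical, not data from the study.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Share of items on which the two annotators gave the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical labels (NOT from the study): one student versus a tool.
student = ["bot", "bot", "human", "human", "bot", "human", "human", "bot"]
tool = ["bot", "human", "human", "human", "human", "human", "bot", "bot"]
kappa = cohens_kappa(student, tool)
print(f"kappa = {kappa:.2f}")
```

A kappa near 1 means near-perfect agreement, while a value near 0 means the annotators agree about as often as chance alone would predict.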

The bot detection algorithm they subsequently developed achieved roughly 86% accuracy in detecting bots on Twitter. The algorithm uses a type of classifier known as Random Forests, which uses 21 different features to detect bots, and the classifier itself is trained on the original dataset annotated by the human annotators.
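Accuracy and precision, both mentioned in this article, measure different things, and a short sketch may help keep them apart. The counts below are hypothetical, chosen only so that accuracy comes out at 86%. They are not the study's confusion matrix, and the function name is an assumption for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision and recall from confusion-matrix counts.

    'Positive' here means 'classified as a bot'.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total                     # share of all calls that were right
    precision = tp / (tp + fp) if tp + fp else 0.0   # flagged as bot and truly a bot
    recall = tp / (tp + fn) if tp + fn else 0.0      # true bots that were caught
    return {"accuracy": accuracy, "precision": precision, "recall": recall}

# Hypothetical counts for 200 accounts (NOT figures from the study).
metrics = classification_metrics(tp=80, fp=10, tn=92, fn=18)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A detector can score well on accuracy while still having low precision if bots are rare in the sample, which is one reason a single headline number should be read with care.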

The researchers found that bot accounts differ from humans in several key ways. Overall, bot accounts generate more tweets than human accounts. They also retweet far more often, and redirect users to external websites far more frequently than human users. The only exception to this was in accounts with more than 10 million followers, where bots and humans showed far more similarity in terms of the volume of tweets and retweets.

"We think this is probably because bots aren't that good at creating original Twitter content, so they rely a lot more on retweets and redirecting followers to external websites," said Gilani. "While bots are getting more sophisticated all the time, they're still pretty bad at one-on-one Twitter conversations, for instance - most of the time, a conversation with a bot will be mostly gibberish."

Despite the sheer volume of Tweets produced by bots, humans still have better quality and more engaging tweets - tweets by human accounts receive on average 19 times more likes and 10 times more retweets than tweets by bot accounts. Bots also spend less time liking other users' tweets.

"Many people tend to think that bots are nefarious or evil, but that's not true," said Gilani. "They can be anything, just like a person. Some of them aren't exactly legal or moral, but many of them are completely harmless. What I'm doing next is modelling the social cost of these bots - how are they changing the nature and quality of conversations online? What is clear though, is that bots are here to stay."


Here’s How Pheromones Are Driving Your Sex Life – The Alternative Daily (blog)

Cupid's arrow has long symbolized the mysteries of sexual attraction. But what factors really drive romantic interest? Scientists speculate that airborne chemical signals known as pheromones may explain the biochemistry of love and lust.

The existence of human pheromones remains controversial. It's clear that many plant and animal species use hormonal secretions to communicate information relating to reproduction. For example, in 1959 researchers discovered that female silkworms secreted a powerful aphrodisiac, called bombykol, that can attract male silkworms from miles away. To date, however, ironclad evidence that human behavior is governed by pheromones remains elusive.

Nevertheless, there are a number of intriguing studies that suggest the surprising ways that scents, secretions and body odors containing pheromones may influence human behavior unconsciously.

According to Bettina Pause, a psychologist, "We've just started to understand that there is communication below the level of consciousness. My guess is that a lot of our communication is influenced by chemosignals."

Scientists explain that pheromones in animals are released in sweat, urine and saliva. These chemical messengers appear to have both an emotional and physical effect on other members of their species.

In mammals, for instance, pheromones are detected by a structure in the nose called the vomeronasal organ, which relays signals to the hypothalamus, a region of the brain that controls emotional states, hormonal regulation and sexual arousal.

Some of the most important evidence for the existence of human pheromones comes from a 1998 study by Dr. Martha McClintock, who found that women who live in close proximity (the same dorm, for example) tend to have synchronized menstrual cycles. Scientists believe that chemical messages in sweat are responsible for this harmonization of periods.

One powerful form of evidence that pheromones exist comes from PET scanning technology, which can examine the effect of chemical odors on male and female brains. In one study, researchers found that certain hormone-like smells activated specific areas in the hypothalamus related to sexuality, which are not triggered by other odors.

In the words of Dr. David Berliner, "These findings corroborate that human pheromones do exist, and that women can communicate chemically with men and vice versa. This is a very important finding because it shows specific areas of the brain that are activated by these chemicals."

As you might expect, the brains of heterosexual men and women respond very differently to specific chemical messengers. For example, the brain regions in the female hypothalamus are highly active when women are exposed to testosterone-like chemicals (while exposure to estrogen-like messengers has no effect). Conversely, the brain areas in the male hypothalamus light up like a Christmas tree when men are exposed to estrogen-like hormones.

Scientists believe this gender-specific response to chemical secretions shapes the way men and women perceive each other on an unconscious level.

If pheromones govern sexual arousal, then can they be harnessed to make people more attractive? More specifically, could pheromones be added to perfumes, which could be used to lure desired mates?

One study from the University of Chicago found that pheromone-type chemicals can heighten the heart rate, increase body temperature and change mood. As of yet, however, scientists have been unable to isolate the specific chemicals that trigger attraction and sexual desire.

Of course, many perfume manufacturers claim that their fragrances can spark desire. In fact, most of these products contain pheromones from animals. However, most scientists insist that pheromones are species specific. In other words, until researchers can isolate specific human pheromones or develop synthetic analogs, a true love potion will remain elusive.

Nevertheless, scientists are continuing to investigate pheromones for their scientific, commercial and therapeutic potential. For example, a company called Pherin Pharmaceuticals is looking into ways to use pheromone messengers to alleviate stress, anxiety and menstrual cramps.

The science of pheromones is still very unsettled. However, let's look at some ways researchers believe these chemical signals may be influencing you and driving your sex life:

Research by Wysocki and others indicates that women prefer the musky scent of men who happen to have gene characteristics that match up well with their own DNA. In other words, the nose knows. That is, odor prints may be a huge driver of attractiveness in so far as they help people pick mates with DNA that complements their own. This unconscious form of selection benefits offspring.

Scientists are still a long way off from unraveling the mysteries of attraction and the role that pheromones may play in influencing sexual behavior. For centuries, people have used expressions like "love is in the air" and "love is a matter of chemistry." The emerging science of pheromones suggests that these proverbial adages may be far truer than anyone imagined.

Scott O'Reilly


The anatomy of a Labour leadership spill: Little exits stage Left pursued by the polls – Stuff.co.nz

VERNON SMALL

Last updated 18:27, August 1 2017


Andrew Little steps down as Labour leader and is replaced by his former deputy Jacinda Ardern.

It was early Tuesday when Andrew Little decided finally to call it quits.

In Auckland, after launching Labour's East Coast Bays campaign, he had taken the night to mull it over and some doubts may have lingered.

But faced with a round of early morning media interviews, Little realised he had to make a call. He told his staff to cancel his morning slots and headed for the airport and a flight to Wellington.

It was the point his closest advisers knew the end had come.


But at Parliament his colleagues were uncertain. Many expected him to stay.

Then as Little came through the terminal in Wellington he fended off a reporter's question by denying he was quitting - spreading confusion throughout his staff and MPs and leaving the media trying to reconcile completely contradictory stories.


Andrew Little spoke at Labour's East Coast Bays campaign before retreating to his Auckland hotel room to ponder his future as leader on Monday night.

SEEDS OF A SPILL

But the seeds of the spill that saw Jacinda Ardern handed the Labour leadership started in earnest just a week before.

There had been rumblings in Tuesday's caucus meeting with MPs detecting a sharp softening in support in voter land.

In part that was being driven by votes switching to the Greens in sympathy with co-leader Metiria Turei's "benefit bomb" - her admission she had lied to get a higher benefit back in the 1990s.

But MPs could feel the squeeze coming on from the other end of the spectrum as Winston Peters barnstormed around the provinces and - like the Greens - delivered a message of radical change that Labour's more conservative and centrist plan was not matching.

But there was more. Andrew Little was not energising voters. And when it came to a stark contrast with Prime Minister Bill English ... well it just wasn't happening.

And then came the bombshell that moved a leadership change from beyond the pale to an odds-on possibility. There had been bad polls before but Labour's own pollsters UMR Research reported a dive in support to an all-time low of 23 per cent - worse even than Labour's David Cunliffe-led disaster of 2014.

Over just a month the party's support had ratcheted down six percentage points from 29 per cent - and was not showing any sign of bottoming. Little consulted leading MPs to ask if he should go and was assured of their support. His deputy - and logical replacement - Jacinda Ardern was staying loyal and refusing to countenance a "Plan B" that would send Little down the road.

At 23 per cent, Little's own re-election would be in doubt - and how can you campaign as an alternative PM when you may not even be in the House? Other senior MPs would also be out, and the plan to rejuvenate the caucus with a raft of new and diverse faces would be in tatters.

TIME TO DRAW BREATH

It was then just a question of drawing a breath and waiting.

If the public polls were not as dire, then maybe Labour could avoid a meltdown and struggle through the next eight weeks - and hope for a miracle Labour-Green-NZ First Government after September 23.

But when the One News-Colmar Brunton poll landed on Sunday, confirming Labour at 24 per cent, things began to move fast, accelerated by Little's interview about the poll.

Conceding he had offered to resign was bad enough. It strobed weakness.

But he also conceded he could not credibly lead a Government at that 24 per cent level of support.

It was a "can't do" moment and a kick in the teeth for the party's hopes and morale.

By Monday, the drums were starting to beat loudly, with a third poll from Newshub merely confirming the 23-24 per cent polling range.

The internal message was clear; Little would not be ousted by force, but perhaps he could be persuaded to go?

BAD OR WORSE AHEAD?

He was coming to his own conclusion. The party's campaign launch was just over two weeks away, another UMR poll could be as bad or worse, and time was running out.

A full-on coup, opposed by Little and his lieutenants, would be too divisive and could stir the ire of the unions and membership, who normally play a pivotal part in leadership selections. But they are out of the mix within three months of an election when all the power to hire and fire the boss is in the hands of the MPs.

Party insiders deny the "numbers" were being done in the classic sense of a coup. But chief whip Kris Faafoi was ringing MPs to take the pulse and a Jacinda Ardern-Kelvin Davis ticket was being floated.

Meanwhile, Little was told to sleep on his final call in his Auckland hotel. The effective deadline was Tuesday's caucus meeting.

Back among his MPs, the sense on Monday night was that it was a line call. Even his closest allies in the unions and parties were saying it was impossible to read how he would jump. And MP Stuart Nash had nailed his colours to Little's mast, saying the party would be doomed if there was a leadership spill.

But by Tuesday morning - and without Little showing his hand - the pendulum seemed to swing towards him toughing it out.

Senior MPs at breakfast time were sure he was "going nowhere". The clear denial he was quitting, made when he was door-stopped at Wellington airport by RNZ reporter Mei Heron, just added fuel to the rumour.

CONFUSION SPREADS

It fed a growing sense of confusion as reporters gathered for Little at a 10am press conference. One minute Labour insiders were on song with the signalling all morning - he was staying.

Then, just minutes before he arrived, the word rippled around Labour's office suite in Parliament Buildings.

He was going, and Monday night's Ardern-Davis "ticket" would be in charge by lunchtime.

One insider said Little had been confident as he headed to Parliament he had the "numbers" but at some point had realised he didn't.

There might be no coup, but he could not go on as leader faced with that reality.

But in truth Little had made his call already. And that was to Ardern from his last ride in a Crown car, to tell her "the worst job in politics" was all hers.

-Stuff


Stop saying Facebook’s bots ‘invented’ a new language – Mashable


Tesla CEO Elon Musk made headlines last week when he tweeted about his frustrations that Mark Zuckerberg, ever the optimist, doesn't fully understand the potential danger posed by artificial intelligence.

So when media outlets began breathlessly re-reporting a weeks-old story that Facebook's AI-trained chatbots "invented" their own language, it's not surprising the story caught more attention than it did the first time around.

Understandable, perhaps, but it's exactly the wrong thing to be focusing on. The fact that Facebook's bots "invented" a new way to communicate wasn't even the most shocking part of the research to begin with.

A bit of background: Facebook's AI researchers published a paper back in June, detailing their efforts to teach chatbots to negotiate like humans. Their intention was to train the bots not just to imitate human interactions, but to actually act like humans.

You can read all about the finer points of how this went down over on Facebook's blog post about the project, but the bottom line is that their efforts were far more successful than they anticipated. Not only did the bots learn to act like humans, actual humans were apparently unable to discern the difference between bots and humans.

At one point in the process though, the bots' communication style went a little off the rails.

Facebook's researchers trained the bots so they would learn to negotiate in the most effective way possible, but they didn't tell the bots they had to follow the rules of English grammar and syntax. Because of this, the bots began communicating in a nonsensical way saying things like "I can can I I everything else," Fast Company reported in the now highly cited story detailing the unexpected outcome.

This, obviously, wasn't Facebook's intention since their ultimate goal is to use their learnings to improve chatbots that will eventually interact with humans, who, you know, communicate in plain English. So they adjusted their algorithms to "produce humanlike language" instead.

That's it.

So while the bots did teach themselves to communicate in a way that didn't make sense to their human trainers, it's hardly the doomsday scenario so many are seemingly implying. Moreover, as others have pointed out, this kind of thing happens in AI research all the time. Remember when an AI researcher tried to train a neural network to invent new names for paint colors and it went hilariously wrong? Yeah, it's because English is difficult, not because we're on the verge of some creepy singularity, no matter what Musk says.

In any case, the obsession with bots "inventing a new language" misses the most notable part of the research in the first place: that the bots, when taught to behave like humans, learned to lie even though the researchers didn't train them to use that negotiating tactic.

Whether that says more about human behavior (and how comfortable we are with lying), or the state of AI, well, you can decide. But it's worth thinking about a lot more than why the bots didn't understand all the nuances of English grammar in the first place.


What a nerdy debate about p-values shows about science and how to fix it – Vox

There's a huge debate going on in social science right now. The question is simple, and strikes near the heart of all research: What counts as solid evidence?

The answer matters because many disciplines are currently in the midst of a "replication crisis" where even textbook studies aren't holding up against rigorous retesting. The list includes: ego depletion, the idea that willpower is a finite resource; the facial feedback hypothesis, which suggested if we activate muscles used in smiling, we become happier; and many, many more.

Scientists are now figuring out how to right the ship, to ensure scientific studies published today won't be laughed at in a few years.

One of the thorniest issues on this question is statistical significance. It's one of the most influential metrics to determine whether a result is published in a scientific journal.

Most casual readers of scientific research know that for results to be declared "statistically significant," they need to pass a simple test. The answer to this test is called a p-value. And if your p-value is less than .05 (bingo!), you got yourself a statistically significant result.

Now a group of 72 prominent statisticians, psychologists, economists, biomedical researchers, and others want to disrupt the status quo. A forthcoming paper in the journal Nature Human Behavior argues that results should only be deemed statistically significant if they pass a higher threshold.

"We propose a change to P < 0.005," the authors write. "This simple step would immediately improve the reproducibility of scientific research in many fields."

This may sound nerdy, but it's important. If the change is accepted, the hope is that fewer false positives will corrupt the scientific literature. It's become too easy, using shady techniques known as p-hacking and outcome switching, to find some publishable result that reaches the .05 significance level.

"There's a major problem using p-values the way we have been using them," says John Ioannidis, a Stanford professor of health research and one of the authors of the paper. "It's causing a flood of misleading claims in the literature."

Don't be mistaken: This proposal won't solve all the problems in science. "I see it as a dam to contain the flood until we make sure we have the more permanent fixes," Ioannidis says. He calls it a "quick fix." Though not everyone agrees it's the best course of action.

At best, the proposal is an easy change to implement to protect academic literature from faulty findings. At worst, it's a patronizing decree that avoids addressing the real problem at the heart of science's woes.

There is a lot to unpack and understand here. So we're going to take it slow.

Even the simplest definitions of p-values tend to get complicated. So bear with me as I break it down.

When researchers calculate a p-value, they're putting to the test what's known as the null hypothesis. First thing to know: This is not a test of the question the experimenter most desperately wants to answer.

Let's say the experimenter really wants to know if eating one bar of chocolate a day leads to weight loss. To test that, they assign 50 participants to eat one bar of chocolate a day. Another 50 are commanded to abstain from the delicious stuff. Both groups are weighed before the experiment, and then after, and their average weight change is compared.

The null hypothesis is the devil's advocate argument. It states: There is no difference in the weight loss of the chocolate eaters versus the chocolate abstainers.

Rejecting the null is a major hurdle scientists need to clear to prove their theory. If the null stands, it means they haven't eliminated a major alternative explanation for their results. And what is science if not a process of narrowing down explanations?

So how do they rule out the null? They calculate some statistics.

The researcher basically asks: How ridiculous would it be to believe the null hypothesis is the true answer, given the results we're seeing?

Rejecting the null is kind of like the "innocent until proven guilty" principle in court cases, Regina Nuzzo, a mathematics professor at Gallaudet University, explains. In court, you start off with the assumption that the defendant is innocent. Then you start looking at the evidence: the bloody knife with his fingerprints on it, his history of violence, eyewitness accounts. As the evidence mounts, that presumption of innocence starts to look naive. At a certain point, jurors get the feeling, beyond a reasonable doubt, that the defendant is not innocent.

Null hypothesis testing follows a similar logic: If there are huge and consistent weight differences between the chocolate eaters and chocolate abstainers, the null hypothesis that there are no weight differences starts to look silly. And you can reject it.

You are correct!

Rejecting the null hypothesis is indirect evidence of an experimental hypothesis. It says nothing about whether your scientific conclusion is correct.

Sure, the chocolate eaters may lose some weight. But is it because of the chocolate? Maybe. Or maybe they felt extra guilty eating candy every day, and they knew they were going to be weighed by strangers wearing lab coats (weird!), so they skimped on other meals.

Rejecting the null doesn't tell you anything about the mechanism by which chocolate causes weight loss. It doesn't tell you if the experiment is well designed, or well controlled for, or if the results have been cherry-picked.

It just helps you understand how rare the results are.

But (and this is a tricky, tricky point) it's not how rare the results of your experiment are. It's how rare the results would be in the world where the null hypothesis is true. That is, it's how rare the results would be if nothing in your experiment worked, and the difference in weight was due to random chance alone.

Here's where the p-value comes in: The p-value quantifies this rareness. It tells you how often you'd see the numerical results of an experiment (or even more extreme results) if the null hypothesis is true and there's no difference between the groups.
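One way to make "how rare would this be in the null world" concrete is a permutation test. It is a swapped-in technique for illustration, not the method any study mentioned here necessarily used. If chocolate does nothing, the group labels are arbitrary, so we can shuffle them many times and count how often the shuffled difference is at least as big as the one observed. The weight changes below are made-up numbers, and the function name is hypothetical.

```python
import random

def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    pooled = list(group_a) + list(group_b)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel: under the null, labels carry no meaning
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)

# Hypothetical weight changes in kg (made-up data, 50 people per group).
data_rng = random.Random(42)
chocolate = [data_rng.gauss(-1.0, 2.0) for _ in range(50)]
abstainers = [data_rng.gauss(0.0, 2.0) for _ in range(50)]
p_value = permutation_p_value(chocolate, abstainers)
print(f"estimated p-value: {p_value:.4f}")
```

The returned number is the share of shuffled "null worlds" that produced a gap at least as large as the real one, which is what a p-value is meant to capture.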

If the p-value is very small, it means the numbers would rarely (but not never!) occur by chance alone. And so, when the p is small, researchers start to think the null hypothesis looks improbable. "And they take a leap to conclude their [experimental] data are pretty unlikely to be due to random chance," Nuzzo explains.

And here's another tricky point: Researchers can never completely rule out the null (just like jurors are not firsthand witnesses to a crime). So scientists instead pick a threshold where they feel pretty confident that they reject the null. That's now set at less than .05.

Ideally, a p of .05 means that if you ran the experiment 100 times (again, assuming the null hypothesis is true) you'd see these same numbers (or more extreme results) five times.

And one last, super-thorny concept that almost everyone gets wrong: A p<.05 does not mean there's less than a 5 percent chance your experimental results are due to random chance. It does not mean there's only a 5 percent chance you've landed on a false positive. Nope. Not at all.

Again: A p of .05 means there's a less than 5 percent chance that in the world where the null hypothesis is true, the results you're seeing would be due to random chance. This sounds nitpicky, but it's critical. It's the misunderstanding that leads people to be unduly confident in p-values. The false-positive rate for experiments at p=.05 can be much, much higher than 5 percent.
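To see how the false-positive share can climb past 5 percent, here is a back-of-the-envelope sketch. The inputs are assumptions picked for illustration, not figures from the article: suppose only 10 percent of the hypotheses researchers test are actually true, and studies detect a true effect 80 percent of the time (their statistical power).

```python
def false_positive_share(prior_true, power, alpha):
    """Share of 'significant' results that are actually false positives.

    prior_true: assumed fraction of tested hypotheses that are really true
    power:      assumed chance a study detects a true effect
    alpha:      significance threshold (chance a null effect passes anyway)
    """
    false_positives = (1 - prior_true) * alpha
    true_positives = prior_true * power
    return false_positives / (false_positives + true_positives)

# Assumed inputs: 10% of hypotheses true, 80% power.
share_at_05 = false_positive_share(prior_true=0.10, power=0.80, alpha=0.05)
# Same inputs at the stricter threshold. Holding power fixed is itself an
# assumption: in practice it would fall unless sample sizes grew.
share_at_005 = false_positive_share(prior_true=0.10, power=0.80, alpha=0.005)
print(f"p<.05:  {share_at_05:.1%} of significant results are false positives")
print(f"p<.005: {share_at_005:.1%} of significant results are false positives")
```

Under these assumed inputs, more than a third of results that clear the .05 bar are false positives, and the stricter bar cuts that share to about 5 percent. Different assumptions about the prior or the power would move both numbers.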

Okay. Still with me? It's okay if you need to take a break. Grab a soda. Catch up with Mom. She's wondering why you haven't called in a while. Tell her about your summer plans.

Because now we're going to dive into...

"Generally, p-values should not be used to make conclusions, but rather to identify possibilities, like a sniff test," Rebecca Goldin, the director for Stats.org and a math professor at George Mason University, explains in an email.

And for a long while, a sniff of p that's less than .05 smelled pretty good. But over the past several years, researchers and statisticians have realized that a p<.05 is not as strong of evidence as they once thought.

And to be sure, evidence for this is abundant.

Here's the most obvious, easy-to-understand piece of evidence: Many papers that have used the .05 significance threshold have not replicated with more methodologically rigorous designs.

A famous 2015 paper in Science attempted to replicate 100 findings published in a prominent psychological journal. Only 39 percent passed. Other disciplines have fared somewhat better. A similar replication effort in economic papers found 60 percent of findings replicated. There's a reproducibility crisis in biomedicine too, but it hasn't been so specifically quantified.

The 2015 Science paper on psych studies offered some clues to which papers were more likely to replicate. Studies that yielded highly significant results (less than p=.01) are more likely to reproduce than those that are just barely significant at the .05 level.

"Reporting effects that really aren't there undermines the credibility of science," says Valen Johnson, a co-author of the Nature Human Behavior proposal who heads the statistics department at Texas A&M. "It's important that science adopt these higher standards, before they claim they have made a discovery."

Elsewhere, researchers find evidence of an "epidemic" of statistical significance. "Practically everything that you read in a published paper has a nominally statistically significant result," says Ioannidis. "The large majority of these p-values of less than .05 do not correspond to some true effect."

For a long while, scientists thought p<.05 represented something rare. New work in statistics shows that it's not.

In a 2013 PNAS paper, Johnson used more advanced statistical techniques to test the assumption researchers commonly make: that a p of .05 means there's a 5 percent chance the null hypothesis is true. His analysis revealed that it didn't. "In fact there's a 25 percent to 30 percent chance the null hypothesis is true when the p-value is .05," Johnson said.

Remember: The p-value is supposed to assure researchers that their results are rare. Twenty-five percent is not rare.

For another way to think about all this, let's flip the question around: What if, instead of assuming the null hypothesis is true, we assume an experimental hypothesis is true?

Scientists and statisticians have shown that if experimental hypotheses are true, it should actually be somewhat uncommon for studies to keep churning out p-values of around .05. More often, assuming an effect is true, the p-value should come in lower.

Psychology PhD student Kristoffer Magnusson has designed a pretty cool interactive calculator that estimates the probability of obtaining a range of p-values for any given true difference between groups. I used it to create the following scenario.

Let's say there's a study where the actual difference between two groups is equal to half a standard deviation. (Yes, this is a nerdy way of putting it. But think of it like this: It means 69 percent of those in the experimental group show results higher than the mean of the control group. Researchers call this a medium-sized effect.) And let's say there are 50 people each in the experimental group and the control group.

In this scenario, you should only be able to obtain a p-value between .03 and .05 around 7.62 percent of the time.

If you ran this experiment over and over and over again, you'd actually expect to see a lot more p-values with a much lower number. That's what the following chart shows. The x-axis shows the specific p-values, and the y-axis is the frequency you'd find them repeating this experiment. Look how many p-values you'd find below .001.

(And from this chart you'll see: Yes, you can obtain a p-value of greater than .05 while your experimental hypothesis is true. It just shouldn't happen as often. In this case, around 9.84 percent of all p-values should fall between .05 and .1.)
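The percentages in this scenario can be approximated in a few lines. The sketch below is not Magnusson's calculator. It assumes a two-sided, two-sample z-test with equal group sizes and a known standard deviation, and the function name is made up for illustration. Under those assumptions it lands very close to the figures quoted above.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def prob_p_between(p_low, p_high, effect_size, n_per_group):
    """Chance that a two-sided p-value lands between p_low and p_high.

    effect_size is the true difference in standard deviations.
    Requires 0 < p_low < p_high <= 1.
    """
    # Where the z statistic is centered when the effect is real.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # A smaller p-value corresponds to a larger |z| cutoff.
    z_outer = STD_NORMAL.inv_cdf(1 - p_low / 2)
    z_inner = STD_NORMAL.inv_cdf(1 - p_high / 2)
    observed = NormalDist(mu=shift, sigma=1.0)
    upper_tail = observed.cdf(z_outer) - observed.cdf(z_inner)
    lower_tail = observed.cdf(-z_inner) - observed.cdf(-z_outer)
    return upper_tail + lower_tail


if __name__ == "__main__":
    d, n = 0.5, 50
    print(f"share above control mean: {STD_NORMAL.cdf(d):.1%}")
    print(f"p between .03 and .05:    {prob_p_between(0.03, 0.05, d, n):.2%}")
    print(f"p between .05 and .1:     {prob_p_between(0.05, 0.10, d, n):.2%}")
    print(f"p below .001:             {1 - prob_p_between(0.001, 1.0, d, n):.2%}")
```

The first printed line is the "69 percent" in the scenario: with a true difference of half a standard deviation, that is the share of the experimental group expected to score above the control group's mean.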

This is a specific, hypothetical scenario. But in general, it's weird when so many p-values in the published literature don't match this distribution. Sure, a few studies on a question should get a p-value of .05. But more should find lower numbers.

The biggest change the paper is advocating for is rhetorical: Results that currently meet the .05 level will be called "suggestive," and those that reach the stricter standard of .005 will be called "statistically significant."

"Journals can still publish weak (and of course null) results just like they always could," says Simine Vazire, a personality psychologist who edits Social Psychological and Personality Science (though she is not speaking on behalf of the journal). The language tweak will hopefully trickle down to press releases and news reports, which might avoid buzzwords such as "breakthroughs."

The change, Vazire says, "should make it so that authors need stronger results before they can make strong claims. That's all."

Historians of science are always quick to point out that Ronald Fisher, the UK statistician who invented the p-value, never intended it to be the final word on scientific evidence. To him, statistical significance meant only that the hypothesis is worthy of a follow-up investigation. "In a way, we're proposing to return to his original vision of what statistical significance means," Daniel Benjamin, a behavioral economist at the University of Southern California and the lead author of the proposal, says.

If labs do want to publish statistically significant results, it's going to be much harder.

Most concretely, it means labs will need to increase the number of participants in their studies by 70 percent. The change essentially requires six times stronger evidence, Benjamin says.
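A rough way to see where a figure like 70 percent comes from: in the standard normal-approximation formula for a two-sided test, the required sample size is proportional to the square of (critical value for the significance threshold + critical value for the desired power). The sketch below assumes 80 percent power and that approximation. Both are assumptions made here, as is the function name, since the article does not spell out the calculation.

```python
from statistics import NormalDist


def relative_sample_size(alpha_new, alpha_old, power=0.8):
    """Ratio of sample sizes needed at two significance thresholds.

    Assumes a two-sided z-test, the same true effect size, and the
    same target power under both thresholds.
    """
    z = NormalDist().inv_cdf
    z_power = z(power)
    needed_new = (z(1 - alpha_new / 2) + z_power) ** 2
    needed_old = (z(1 - alpha_old / 2) + z_power) ** 2
    return needed_new / needed_old


if __name__ == "__main__":
    ratio = relative_sample_size(alpha_new=0.005, alpha_old=0.05)
    print(f"about {(ratio - 1) * 100:.0f}% more participants")
```

With a different power target the ratio shifts somewhat, so the 70 percent figure is best read as a typical case rather than a fixed rule.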

The increased burden of proof, the proposal's authors hope, would nudge labs into adopting other practices science reformers have been calling for, such as sharing data with other labs to reach consensus conclusions and thinking more long-term about their work. Perhaps their first experiment doesn't reach this new threshold. But a second experiment might. The higher threshold encourages labs to reproduce their own work before submitting to a publication.

The proposal has critics. One of them is Daniel Lakens, a psychologist at Eindhoven University of Technology in the Netherlands, who is currently organizing a rebuttal paper with dozens of authors.

Mainly, he says the significance proposal might work to stifle scientific progress.

"A good metaphor is driving a car and setting a maximum speed," Lakens says. "You can set the maximum speed in your country to 20 miles an hour, and no one is going to get killed. You hit someone, they won't die. So that's pretty good, right? But we don't do this. We set the maximum speed a little higher, because then we actually get somewhere a little bit quicker. ... The same is for science."

Ideally, Lakens says, the level of statistical significance needed to prove a hypothesis depends on how outlandish the hypothesis is.

Yes, you'd want a very low p-value in a study that claims mental telepathy is possible. But do you need such an extreme level when testing out a well-worn idea? The high standards could impede young PhDs with low budgets from testing out their ideas.

Again, a p-value of .05 doesn't necessarily mean the experiment will be a false positive. A good researcher would know how to follow up and suss out the truth.

Another critique of the proposal: It keeps scientific communities fixated on p-values, which, as discussed in the sections above, don't really tell you much about the merits of a hypothesis.

There are better, more nuanced approaches to evaluating science.

Such as:

Ioannidis admits that "statistical significance [alone] doesn't convey much about the meaning, the importance, the clinical value, utility [of research]."

Ideally, he says, scientists would retrain themselves not to rely on null-hypothesis testing. But we don't live in the ideal world. In the real world, p-values are a quick and easy tool any scientist can use to run their tests. And in our real world, p-values still carry a lot of weight in deciding what gets published.

"With the proposal, you don't need to train all these millions of people in heavy statistics," Ioannidis says. "And it would work. It would help."

Redefining statistical significance is not an ideal solution to the problem of replication. It's a solution that nudges people to adopt the ideal solution.

Though no one I spoke to said it directly, I wouldn't be surprised if some scientists find that a bit patronizing. Why couldn't they learn advanced statistics? Or come to appreciate a more nuanced way of evaluating results?

There's a critique of the proposal that the authors I spoke to agree with completely: Changing the definition of statistical significance doesn't address the real problem. And the real problem is the culture of science.

In 2016, Vox sent out a survey to more than 200 scientists, asking, "If you could change one thing about how science works today, what would it be and why?" One of the clear themes in the responses: The institutions of science need to get better at rewarding failure.

One young scientist told us: "I feel torn between asking questions that I know will lead to statistical significance and asking questions that matter."

The biggest problem in science isn't statistical significance. It's the culture. She felt torn because young scientists need publications to get jobs. Under the status quo, in order to get publications, you need statistically significant results. Statistical significance alone didn't lead to the replication crisis. The institutions of science incentivized the behaviors that allowed it to fester.

Keep in mind, this is all just a proposal, something to spark debate. To my knowledge, journals are not rushing to change their editorial standards overnight.

This will continue to be debated.

But if it turns out that it's still hard to publish "suggestive" results, and if it's still difficult to secure grant money off "suggestive" results, then the institutions of science will not have learned their lesson. Yes, a lot of this is just tweaking the language of how we talk about science. But we have to make words like "suggestive" and "null results" matter.

"Failures, on average, are more valuable than positive studies," Ioannidis says.

Scientific institutions and journals know this. They don't always act like they do.

See original here:
What a nerdy debate about p-values shows about science and how to fix it - Vox

Antisocial bees share genetic profile with people with autism – Science Magazine

Honey bee workers tending an egg-laying queen.

Zachary Huang, beetography.com and Michigan State University

By Elizabeth Pennisi, Jul. 31, 2017, 3:00 PM

Most honey bees are as busy as, well, a bee, tending to the queen and her young, guarding the hive, and generally buzzing and flitting around in near constant motion. But some bees just sit around and rarely interact with their comrades. A new study reveals that these antisocial insects share a genetic profile with people who have autism spectrum disorders, which can affect how well they respond to social situations.

The work speaks to how evolution may tap the same molecular pathways in very different animals, even for traits as complex as social behavior, says Hans Hofmann, an evolutionary neuroscientist at the University of Texas in Austin who was not involved with the study. "The neural circuits underlying social behavior must be very different for humans and honey bees, yet it appears at the molecular level, the genes are employed in a similar manner," he says. "That's kind of striking."

To look for variation in honey bee social behavior, Hagai Shpigler, a postdoctoral fellow at the University of Illinois (UI) in Urbana, designed two tests where he and colleagues video recorded a group of bees and analyzed each individual's reaction to a social situation. In one test, he stuck an unfamiliar bee in with the group. Bees instinctively guard and typically react by mobbing the stranger and sometimes harming it. In the second test, Shpigler put an immature queen larva in with the group. Queen larvae bring out mothering instincts, and worker bees tend to feed the larva. He subjected 245 groups of bees from seven different colonies, 10 bees per group, to these tests multiple times, then ranked how eagerly the bees responded to these situations.

Most bees reacted to at least one situation, but about 14% were unresponsive to both, he and his colleagues report today in the Proceedings of the National Academy of Sciences. The team sacrificed some of the bees and isolated the genes active in the insects' mushroom bodies, a part of the brain responsible for complex actions such as social behavior. They found a distinctive subset of genes was active in the nonresponsive bees. Then they compared that set of genes to sets of genes implicated in autism spectrum disorder, schizophrenia, and depression. Even though bees and people are very different evolutionarily, they have many genes in common.

There was a good match only between the gene activity of the nonresponsive bees and genes associated with autism, the team reports. Some of the genes involved help regulate the flow of ions in and out of the cells, particularly nerve cells; others code for so-called heat shock proteins that are typically induced during stress.

The researchers don't yet know how exactly these genes influence social behavior in either bees or people, but manipulating the genes in honey bees may shed light on what they do in humans, says Alan Packer, a geneticist at the Simons Foundation in New York City, which funds autism research, including this bee work. Packer was not involved with this project but has been compiling a list of genes implicated in autism spectrum disorders.

Claire Rittschof, an entomologist at the University of Kentucky in Lexington who was not involved with the work, cautions that the nonresponsive bees might prove to be responsive in a different social situation. "It's difficult to separate social responsiveness from behavioral variation in general," she notes. But she's fascinated by the idea that similar genes shape social behavior in different species.

No one is drawing exact parallels between honey bee and human behaviors, Packer notes. "We do not want to give the impression that bees are little people or humans are big bees," says team leader Gene Robinson, a behavioral genomicist and director of the UI Carl R. Woese Institute for Genomic Biology. But, says Packer, if you want to understand how these genes interact, the honey bee might be a useful model. He's eager to know whether this same set of genes is involved in social responsiveness of other animals. The more models that are available to study how these genes give rise to these behaviors, the better.

It's not clear why these asocial bees are tolerated by the rest of the hive. Rittschof thinks these individuals are considered part of the group despite their unusual behavior. Both human and bee societies contain and accommodate a range of different personality types, strengths, and weaknesses, she suggests.

Read this article:
Antisocial bees share genetic profile with people with autism - Science Magazine

Using Numbers To Comprehend And Control Human Behavior – NPR – NPR

Since the Enlightenment, champions of progress have urged us to break free of the chains of tradition.

Just because "we've always done it this way" is no reason to keep doing it this way. It is irrational, it is dumb, indeed, it is frequently dishonest, to cling to traditions, they say. If we aim to understand the world and control it (the abiding ambition of all empirically minded thinkers), then surely we can dispense with the baggage of inherited convention.

Keith Law has just published a book that explores this question. The book is opinionated and it is sparked by fury. Indeed, Law writes as one who speaks truth to power. It is written by someone who thinks of himself as at the vanguard, the revolutionary forefront.

It is now possible, he insists, indeed, it is now mandatory, that we use mathematical analysis and statistics not only to evaluate human achievement, but also to learn how to predict it in the future.

I exaggerate, maybe just a little bit.

Law's book is about the use of statistics in baseball. And while his assault on the Old Ways is driven by a real sense of outrage at the way irrational tradition shackles progressive thinking, he confines himself, by and large, to bad thinking in the domain of baseball. It is baseball he wants us to learn to think right about.

Law is a writer at ESPN and his book, published in April, is called Smart Baseball: The Story Behind The Old Stats That Are Ruining The Game, The New Ones That Are Running It, And The Right Way To Think About Baseball.

For Law, the "old stats" are ruining the game. Batting Average, for example, is a terrible measure of a batter's offensive "value," since it considers hits-per-at-bat. This is doubly wrong-headed, he contends: It ignores the fact that not all hits are created equally (a home run is worth more than a single), and it disregards the batter's offensive achievements (e.g. walks), which don't happen during at bats (since not all plate appearances count as at bats). Likewise, Runs Batted In is not only uninformative about how good a player is offensively, it is dishonest, for it confuses his accomplishments with those of his teammates, Law says. You can only drive batters in, after all, if there are runners on base to be driven in.
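To make the complaint about batting average concrete, here is a small sketch comparing two made-up batters. The stat lines are hypothetical, and the two alternative measures used (on-base percentage and slugging percentage, computed with their standard formulas) are brought in here for illustration; the article itself does not name them.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """A season stat line. All example values below are invented."""
    at_bats: int
    singles: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int = 0
    sacrifice_flies: int = 0

    @property
    def hits(self):
        return self.singles + self.doubles + self.triples + self.home_runs

    def batting_average(self):
        # Hits per at-bat: every hit counts the same, walks are invisible.
        return self.hits / self.at_bats

    def on_base_percentage(self):
        # Counts every way of reaching base without making an out.
        reached = self.hits + self.walks + self.hit_by_pitch
        chances = (self.at_bats + self.walks + self.hit_by_pitch
                   + self.sacrifice_flies)
        return reached / chances

    def slugging(self):
        # Total bases per at-bat: a home run is worth four singles.
        total_bases = (self.singles + 2 * self.doubles
                       + 3 * self.triples + 4 * self.home_runs)
        return total_bases / self.at_bats


if __name__ == "__main__":
    slap_hitter = BattingLine(500, 150, 0, 0, 0, walks=20)
    slugger = BattingLine(500, 90, 30, 0, 30, walks=80)
    for name, line in (("slap hitter", slap_hitter), ("slugger", slugger)):
        print(f"{name}: AVG {line.batting_average():.3f}  "
              f"OBP {line.on_base_percentage():.3f}  "
              f"SLG {line.slugging():.3f}")
```

Both invented batters hit .300, yet one reaches base far more often and collects far more total bases, which is the kind of difference batting average cannot show.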

Or consider the evaluation of pitching performance by wins; this is even more outrageous, he says. You can only win if your team scores, and the pitcher has no control over that. The idea that it is the pitcher who wins is premised on the idea that good pitchers have a kind of magic that leads their teams to victory. And that, Law is certain, is so much nonsense. Praising an individual player for results over which he has nothing resembling control isn't very bright. It isn't going to help you figure out what's really going on on the field, and might very well lead you to make bad baseball decisions.

We use statistics, Law holds, to evaluate performance. We want to understand what a player actually does on the field, and we want to predict likely performance going forward. We need objectivity to do this. We need data. We need metrics that cut through the noise to the reality. The last thing we need are old-fashioned prejudices about pitchers winning games and RBI being a measure of a player's offensive value to his team, he says.

Can we do what Law and his fellow "quants" demand? Can we use numbers to assign value, to sort through praise and blame, and to ground baseball decisions in matters of value-neutral fact? I get it that this is something baseball executives want. Michael Lewis explained in Moneyball that the new statistics make it possible to discover sources of baseball value that traditional thinking has tended to ignore. And I get it that if you're a player, or a manager, or a fan, the problem of evaluating and predicting is of the greatest importance.

But is it actually possible, in baseball, or in life, so to regiment, comprehend and control human behavior?

I think there are reasons to doubt this.

One of the things that particularly bugs Law about the RBI stat is that there are cases, as he notes, where the official scorer has discretion over whether to award the RBI. He continues:

"[A]ny stat that involves such human objectivity [I think he meant to write "subjectivity"] is immediately reduced in value as a result. People are prone to so many cognitive biases and are so inconsistent in their judgments..."

But in fact, I would argue, all baseball stats rest, finally, on just this sort of subjectivity. Consider: At the lowest level, baseball is about hits and outs. For example, Law argues that the basic job of a batter is to not make an out, that is, to get on base.

But are outs determined in a value-free, objective way? Not really. Very frequently, at least, the question of whether an out was made is a judgment call. Instant replay hasn't changed this. It's just removed the required judgment call to a remote location.

And the same is true of hits themselves. When is a hit a hit, and when is it the result of a fielder's error? Nothing determines this other than the decision of the official scorer.

And let's not even get into balls and strikes!

However you look at it, the low-level facts on the ground, the smallest units of meaningful baseball (hits, outs, balls, strikes, foul or fair), are themselves intrinsically soft, squishy, value-laden matters of interpretation.

Bring the biggest quantificational cannon you can find. It won't shoot straight if you set it down on shifting sands.

But maybe this is not a bad thing. Maybe this is what we love about baseball. We are called on to evaluate, to make choices, to make predictions, to lay odds, precisely when there are no algorithms or mathematical rules to do this for us.

I don't advocate a return to tradition. I think Law and his colleagues are right that there is a value in new analytical tools for thinking about baseball. But that's a far cry from accepting his idea that it is possible to use numbers, by themselves, to identify and control value, in baseball, or anywhere else.

Want to know what happened on the field? You'd better take a look.

Alva Noë is a philosopher at the University of California, Berkeley, where he writes and teaches about perception, consciousness and art. He is the author of several books, including his latest, Strange Tools: Art and Human Nature (Farrar, Straus and Giroux, 2015). You can keep up with more of what Alva is thinking on Facebook and on Twitter: @alvanoe

Continue reading here:
Using Numbers To Comprehend And Control Human Behavior - NPR - NPR

Lab automation and Six Sigma levels: Here’s what we learned – GlobeNewswire (press release)

"We found that to achieve this level [Six Sigma], a laboratory needs automation," says Charles Hawker, who helped develop ARUP's highly sophisticated automation system.

Salt Lake City, Utah, July 31, 2017 (GLOBE NEWSWIRE) -- Attaining Six Sigma Levels in the Laboratory: Here's What We Learned

SALT LAKE CITY, July 26, 2017 - This month, ARUP Laboratories published a report in Journal of Applied Laboratory Medicine (JALM) detailing its 25-year journey toward achievement of a Six Sigma score for lost specimens.

"We found that to achieve this level, a laboratory needs automation," says Charles Hawker, PhD, MBA, who coauthored the article in JALM. The Six Sigma quality method seeks to achieve error rates of no more than 3.4 defects per million opportunities.
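As an illustration of how a defect rate maps onto a sigma level, the sketch below uses the common Six Sigma convention of converting the defect-free fraction to a standard normal quantile and adding a 1.5-sigma shift. The function name is made up, and nothing here is taken from the JALM article's own calculations.

```python
from statistics import NormalDist

# Conventional allowance for long-term process drift in Six Sigma tables.
SIGMA_SHIFT = 1.5


def sigma_level(defects, opportunities):
    """Sigma level for a given number of defects per opportunities.

    Requires 0 < defects < opportunities, since a defect rate of exactly
    zero or one has no finite normal quantile.
    """
    if not 0 < defects < opportunities:
        raise ValueError("need 0 < defects < opportunities")
    defect_free_fraction = 1 - defects / opportunities
    return NormalDist().inv_cdf(defect_free_fraction) + SIGMA_SHIFT


if __name__ == "__main__":
    # 3.4 defects per million opportunities is the Six Sigma target.
    print(f"{sigma_level(3.4, 1_000_000):.2f}")
    # 6,210 per million is the usual table value for four sigma.
    print(f"{sigma_level(6_210, 1_000_000):.2f}")
```

The same function can be fed a lab's own counts, for example lost specimens divided by specimens handled, to see how far a process sits from the 3.4-per-million target.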

"To my knowledge, ARUP is the first clinical laboratory in the country to achieve Six Sigma quality for any metric," adds Hawker. For nearly two decades, Hawker has helped develop ARUP's highly sophisticated automation system.

While the ultimate goal is perfection, particularly in healthcare, making incremental progress toward this goal is the focus of ARUP's continuous improvement system. In clinical laboratories, mistakes in the analytic area are generally minor contributors to poor laboratory quality and diagnostic error. The majority of mistakes, including lost or misplaced specimens, happen in the realm of nonanalytic processes.

Some 55,000 specimens, destined for testing in 70 specialized laboratories, are processed daily at ARUP, so tracking the precise location of a single specimen is a herculean task. From time to time, one of these samples may lose its way.

The JALM article homes in on lost-sample solutions that involve automation and human behavior controls, but the corporate culture is another important consideration. "It's a patient-centric culture here; each specimen is a patient," says David Rogers, who oversees specimen processing and also coauthored the article.

"We want this report to show other laboratories that they too can attain this level of quality," emphasizes Hawker. Readers learn how the automation of nonanalytic processes decreases the number of lost specimens. In addition, the article covers a variety of engineering and behavioral controls, which relate to how humans work, that have played a role in this remarkable achievement.

"Every time a human touches a sample it creates an opportunity for error," explains Bonnie Messinger, ARUP's process improvement manager and the article's lead author. She estimates that a specimen could be handled 20 or more times from the point it leaves the client until it is discarded.

Automation Improvements

Using data spanning the 25-year period, the authors show the correlation between lost specimens and the implementation dates for eight major phases of automation, along with 16 process improvements and engineering controls. While implementation of process improvements, engineering controls, and automation all contributed to overall reduction in the lost specimen rate, the data shows that automation was the most significant contributor.

With each automation enhancement, lost specimen rates decreased. It did not happen immediately, but over the succeeding months, each new level of automation led to improvement. Because the automation stages and various process improvements overlapped, it was not possible to look at any particular stage or process enhancement in isolation, but collectively, the various changes have produced a nearly 100-fold improvement in the lost-sample Six Sigma metric.

Error-Proofing and Human Behavior Management

Human behaviors are influenced by process and engineering controls. In collaboration with ARUP's in-house engineering team, zeroing in on relatively small modifications to the work environment proved to be quite effective.

"We have 18 different behavioral management strategies, ways of encouraging certain behaviors and preventing others," says Messinger. Such changes can be very simple, such as encouraging people to keep their work areas uncluttered or establishing a lost-sample checklist.

Sharing with Others

The article attributes the remarkable decrease in the frequency of lost specimens not to a single intervention, but to a multifaceted, cumulative approach. "Our results demonstrate that two approaches, automation and designed behavioral controls, working together, can yield remarkable results," says Messinger.

The article's coauthors emphasize that even if a laboratory doesn't have the same level of automation as ARUP, any degree of automation that replaces an error-prone process will help reduce error. They also assert that the main purpose of the article is to share stories of success and spread healthcare improvement ideas.

About ARUP Laboratories

ARUP Laboratories is a national reference laboratory with more than 90 medical experts who are available for consultation. These experts are faculty at the University of Utah School of Medicine and many participate in care teams at the Huntsman Cancer Hospital and Primary Children's Hospital. In addition, ARUP is a worldwide leader in innovative laboratory research and development, led by the efforts of the ARUP Institute for Clinical and Experimental Pathology.

Attachments:

A photo accompanying this announcement is available at http://www.globenewswire.com/NewsRoom/AttachmentNg/14937de1-f9c8-497c-81db-daa8b6efb866

Go here to read the rest:
Lab automation and Six Sigma levels: Here's what we learned - GlobeNewswire (press release)

Is This Dog Dangerous? Shelters Struggle With Live-or-Die Tests – New York Times

The 10- to 20-minute tests, developed by behaviorists and tweaked by practitioners, ask two basic questions: Will the dog attack humans? What about other dogs?

Evaluators may observe the dog react to a large doll (a toddler surrogate); a hooded human, shaking a cane; an unfamiliar leashed dog or a plush toy dog.

But these tests have never been rigorously validated.

Dr. Bennett's 2012 study of 67 pet dogs, which compared results of two behavior tests with owners' own reporting, found that in the areas of aggression and fearfulness, the tests showed high percentages of false positives and false negatives. A 2015 study of dog-on-dog aggression testing showed that shelter dogs responded more aggressively to a fake dog than a real one.

Janis Bradley of the National Canine Research Council, co-author with Dr. Patronek of the analysis published last fall, suggested that shelters should instead devote limited resources to observing the many interactions that happen between dogs and people in the daily routine of the shelter.

But Kelley Bollen, a behaviorist and shelter consultant in Northampton, Mass., maintained that a careful evaluation can identify potentially problematic behaviors. Much depends on the assessor's skill, she added.

In fact, no qualifications exist for administering evaluations. Interpreting dogs, with their diverse dialects and complex body language (wiggling butts, lip-licking, semaphoric ears and tails), often becomes subjective.

Indianapolis Animal Care Services, which admitted 8,380 dogs to its municipal shelter in 2016, is often overcrowded and understaffed, yet faces intense scrutiny to save dogs while protecting the public. Last year it euthanized 718 dogs for behavior, based on testing and employee interactions. The agency consulted Dr. Bennett, a shelter specialist, to better manage that difficult balance.

Even as she demonstrated assessments for staff members, Dr. Bennett noted another factor that renders results suspect: the unquantifiable impact of shelter life on dogs.

Dogs thrive on routine and social interaction. The transition to a shelter can be traumatizing, with its cacophony of howls and barking, smells and isolating steel cages. A dog afflicted with kennel stress can swiftly deteriorate: spinning; pacing; jumping like a pogo stick; drooling; and showing a loss of appetite. It may charge barriers, appearing aggressive.

Conversely, some dogs shut down in self-protective, submissive mode, masking what may even be aggressive behavior that only emerges in a safe setting, like a home.

Little dogs can become more snippy. But no matter what evaluations may show, they always seem to get a pass. "I'll warn, 'He nips and snarls,'" recounted Laura Waddell, a seasoned trainer who does volunteer evaluations for Liberty Humane Society in Jersey City, N.J. "And I get back: 'I don't care! I'm in love!'"

One way to reduce kennel stress, Ms. Sadler, the shelter consultant, said, is through programs like hers, Dogs Playing for Life, which matches dogs for outside playgroups. Shelter directors say it is a more revealing and humane way to evaluate behavior. The approach is used at many large shelters, including in New York City, Phoenix and Los Angeles.

The most disputed of the assessments is the food test. Research has shown that shelter dogs who guard their food bowls, as Bacon did, do not necessarily do so at home.

The exercise purports to evaluate resource guarding: how viciously a dog will protect a possession, such as food, toys, people. Common-sense owners wouldn't grab a dog's food while it is eating. But shelters worry about children.

Dr. Bennett suggested that Bacon's bite of the fake hand didn't necessitate a draconian outcome. With counseling, she said, a household without youngsters would be fine.

The shelter workers dearly wanted to save Bacon. But they were so overwhelmed that they did not have the capability to match him appropriately and counsel new owners.

So Bacon remained at the shelter for several weeks, waiting. Finally, Linda's Camp K9, an Indiana pet-boarding business that also rescues dogs, took him on. He settled right down and recently was adopted. Linda Candler, the director, placed him in a home without young children, teaching the owners how to feed him so he wouldn't be set up to fail.

"His potential made him stand out," Ms. Candler said. "Bacon is amazing."

Read the original:
Is This Dog Dangerous? Shelters Struggle With Live-or-Die Tests - New York Times

Metrion Biosciences and LifeArc Announce Collaboration to Support LifeArc’s Neuroscience Programme – Technology Networks

Metrion Biosciences Limited (Metrion), the specialist ion channel CRO and drug discovery company, and LifeArc, the UK medical research charity previously known as MRC Technology, have announced an extension of their existing partnership to support LifeArc's neuroscience drug discovery programme.

Under the terms of the agreement Metrion will provide validated ion channel and electrophysiology-based assays and safety profiling services, and LifeArc will conduct medicinal chemistry aimed at identifying novel modulators of an undisclosed CNS ion channel target. In addition, Metrion will contribute translational research expertise to evaluate the activity of LifeArc compounds in human neuronal networks.

Metrion will provide translational assay support by applying its extensive background knowledge in ion channel research, microelectrode array (MEA) technology, and access to its CiPA-compliant cardiac safety assays.

Dr Andrew Southan, Chief Operating Officer, Metrion Biosciences, said: "The Metrion team has a long history of developing, validating, and providing specialist ion channel assays to optimise and select development candidate molecules. We believe combining this with translational neuroscience and microelectrode array capability, as we are in this promising project with LifeArc, may be particularly successful in CNS research."

Justin Bryans, Executive Director, Drug Discovery, LifeArc, commented: "LifeArc is committed to working with cutting edge organisations such as Metrion, capitalising on our combined expertise and capabilities to advance programmes addressing human health. Our previous experience in working with the team at Metrion has been excellent, and we look forward to continuing the relationship."

This article has been republished from materials provided by Metrion Biosciences. Note: material may have been edited for length and content. For further information, please contact the cited source.

Read more:
Metrion Biosciences and LifeArc Announce Collaboration to Support LifeArc's Neuroscience Programme - Technology Networks