Mia Lobel

Newsroom of the Future - Tech Feature

"It's me," says John to the intercom outside his glass-encased office.

"Good morning, John," answers a friendly female voice. The voice is emotionless, accent-less, accommodating. She sounds like a secretary in a tailored suit. John’s door slides open.

He steps into the soundproof room. The door slides shut. He throws his coat over his wide-armed desk chair and puts his coffee beside a flat-screen computer monitor. "You have three new voice mail messages and two saved messages," says the voice. "Would you like me to play them for you?"

"Not right now, thank you," John says. "Get me my e-mail."

"Sure thing. Just a minute."

John sits in front of his monitor and sips his coffee. He looks out his 17th-story window to the New City stretching out below him. He can faintly hear the hum of the newsroom server pulsing below his feet.

"You have eight new e-mail messages." His account flickers up on the monitor.

"Scroll down, please." Old messages disappear into the top of the screen. "Stop. Read the message from Jay Thompson."

The computer reads the message in the same friendly tone - ever patient, ever helpful.

“Reply,” says John. “Dear Mr. Thompson. Thank you for your message. I’m looking forward to meeting with you for an interview this afternoon...”

John's words float through the space of his office and appear on the monitor. He finishes composing his message and commands, "Read it back to me, please."

The computer reads the message back, word for word.

"Thank you. Send it. And please call me a taxi."

"No problem." The message flicks out into cyberspace. A faint dial tone and a series of beeps and clicks as the computer accesses the phone database and requests a taxi from another computer halfway across the New City. Fifteen minutes later, the computer interrupts an Internet research session.

"I'm sorry to interrupt, John, but your taxi has arrived."

"Thank you. Take a break."

"Sure. Coffee sounds good right now. Say ‘come back’ when you need me."

John smirks at the occasional humor of his assistant as he puts on his coat and steps out the door, which has slid open silently. On his way down the newsroom corridor, he passes other journalists settling in for the morning, their lips moving silently behind the soundproof glass, mouthing commands to their computers, which respond obediently. Thousands of microcomputers are controlled by a central server with access to every resource a journalist could want - phone books, maps, government directories, dictionaries, the AP Stylebook, the Internet - all operated by the human voice.

This is the newsroom of the future. And it's not as far off as some may think. In fact, the two types of technology needed to run this newsroom already exist – speech recognition and voice interface systems – and it’s only a matter of time before the two come together.

Gordon Moore, Intel co-founder and the creator of Moore’s Law – which has described the exponential growth of computerized technology since the 1960s – says “really good” speech recognition will be the next most important change in the computer industry. “By ‘really good,’” Moore tells Business 2.0, “I mean enabling the computer to differentiate among ‘to,’ ‘two,’ and ‘too.’ It has to understand in context. I think once the computer gets to that point, it essentially understands language and you can have a conversation with it. To me, that could have a dramatic impact on the way people use computers.”


*    *    *    *    *

Scholle Sawyer McFarland, managing editor of San Francisco-based Macworld magazine, began using speech recognition software four years ago when a repetitive strain injury left her unable to use her arms and hands. She was 26 years old. "I thought I was going to be fired," she said. "I didn't know if I'd ever be able to work again. I couldn't touch a computer. Journalism is all about being able to work hard whenever you're needed. And if you can't do it, somebody else can."

At the time, she was working as a writer for the monthly computer magazine. As one of only two feature writers for the feature-heavy publication, she worked under extreme pressure – 14 hours a day, six days a week. Her body couldn't handle it. The tendons in her shoulders wore down until they could no longer hold her shoulders in their sockets. "It felt like someone had pulled my arm off," she said. "I couldn't even really sit up straight for very long. They called me the brain in the jar."

After five weeks of physical therapy and anti-inflammatory drugs, Sawyer McFarland returned to work. But she could never go back to the work style she had before her injury. She was lucky. Macworld readily adapted her responsibilities to fit her physical limitations. They completely revamped her workstation with a futuristic-looking ergonomic setup, including a $400 keyboard and a $300 chair. Macworld even bought her a Windows-compatible PC so she could use the latest speech recognition software, which had not yet been adapted for Macs. The company spent about $5,000 to get Sawyer McFarland working again, not including the cost of lost workdays.

"The more and more efficient and the faster and faster the technology has gotten, the worse and worse it's gotten for people," says Sawyer McFarland. There was a point in her career when she would e-mail a coworker sitting not three feet away. "It was like we'd forgotten that we can get up off our butts and just walk over and talk to someone."

But the technology that has caused all these problems is now being adapted to work for those who were harmed by it. And for those who use a keyboard daily, especially journalists working under tight deadlines, this technology will have a great impact.

The speech recognition technology that got Sawyer McFarland working again is becoming more and more common, and surprisingly cheap. Dragon NaturallySpeaking, manufactured by Lernout & Hauspie, is the most recognizable of these programs and costs about $60 for basic dictation software. A more advanced version that allows oral editing and mode-less speech – where the user can switch back and forth between dictation and commands like “open Netscape” or “print document” – costs $150 to $200. But the software doesn’t work well without a high-speed computer with a lot of memory, and learning to use the software can be time-consuming and frustrating.

*    *    *    *    *

John’s taxi drops him at Thompson’s office ten minutes early. In preparation for their interview, John puts in a call to his newsroom server. “Go to remote transcription mode,”  he commands.

“OK, John.  What’s the slug?”

“Election.”

“Thank you. Just a moment.”

John heads to the reception area and looks at his watch. One hour to deadline.

“OK, John. Remote transcription mode is ready.”

John speaks some preliminary thoughts into his phone: the date, time, some basic information about Thompson, a description of the office.

A few minutes later, Thompson sticks his head into the waiting area and motions to John. A quick introduction and handshake. “Do you mind if I record this conversation?” asks John.

“Go ahead.”

John places his cell phone face up on the desk between them. “Ready to go,” he says. “Just speak normally.”

By the end of the interview, everything that has been said – John’s notes, his questions, Thompson’s answers, every cough, every um, every sound – will be recorded and sent digitally through the cell phone to the newsroom server halfway across the city. When he returns to the office, a complete transcription of the interview will be waiting for him on his computer. It will be filed under the slug “Election,” along with the server’s ever-growing collection of phone calls, interviews, and business meetings – all recorded and transcribed remotely, all accessible by voice.

*    *    *    *    *

Susan Fulton, an administrative editor at the New York Times, has been using speech recognition software since her repetitive strain injury in 1990. “I can run rings around most people using my computer,” she says, “using a combination of keyboard and mouse and speech recognition.” While there are currently only 10 people in her New York office using speech recognition, she expects that number to grow. “If you have one method that’s working, i.e. typing, and you’re accustomed to doing it, there’s absolutely no reason to change to another method unless you know that you’re going to get something out of it,” says Fulton.  “But if the programs become easier to use… then they may eventually find that it is actually faster than typing.”

Learning to use speech recognition software is currently the biggest hurdle to overcome. The training period in which the computer gets to know your voice and your way of speaking takes only five to ten minutes. But getting comfortable with the software, adding vocabulary, and learning how to write and edit aloud can take months. For a reporter on deadline, says Fulton, spending months on training is not an option. But if you can find the time to master the techniques of the software, speech recognition can be easier and more efficient than typing. It can even improve your writing.

“The keyboard with the computer is a little bit of a crutch,” says Fulton. “In daily journalism, a lot of times people don’t think before they put something down on paper. They’re using the paper as part of their thought process instead of letting that go forth in their head.” She says speech recognition can train you to think before you write – a lost art in journalism. Essentially, she says, this could eliminate the need for a keyboard and mouse. “Writing isn’t with your fingers. It’s really the brain that’s doing the writing.”

Speech recognition has changed Fulton’s writing style. She’s more concise. She writes with simple phrases and shorter sentences “like you were going to explain something to your mother,” she says. “People used to be able to call their notes back into their office in publishable form. We’re out of practice. We should be able to do this again.”

*    *    *    *    *

The other half of the newsroom of the future is voice interface – talk to the computer and the computer talks back. Currently, these systems are mostly used to bring the Internet and all its information to the telephone. Companies like BeVocal and Tellme provide 800 numbers that anyone can call to get driving directions, stock quotes, weather and traffic updates, and restaurant, movie and music reviews. Callers can catch up on soap operas, get their horoscope, check the lottery, or play computerized blackjack while sitting in traffic. (Incidentally, the dealer sounds like Sean Connery.) There are also computerized operators that will direct your business calls, and automated secretaries that will handle your e-mail, faxes, address book, and calendar.

Bill Byrne has studied linguistics for 12 years and is now working to give computers the power of speech. Working with the speech and language team of General Magic, a Sunnyvale-based company specializing in voice interface systems, Byrne creates computerized personalities that are then sold to businesses. Like a logo or brand, companies take on a voice and a personality to match. “Imagine you’re a company that sells stock information and you want to voice enable all that stuff. If you have one voice, that voice will now be attributed to that brand. The voice in effect is going to say, ‘I am Charles Schwab.’”

Hundreds of businesses are setting up automated operators that will provide company information: airlines, banks, investment firms, auction sites, even department stores. So far, General Magic has created seven different personalities, including the down-to-business money manager, popular with stock and investment firms; the surfer dude who does music reviews; and the dominant friendly female, similar to the voice in the newsroom of the future.

“She’s our most popular voice,” says Byrne. “She is pretty efficient, but also very friendly, very polite, handles things nicely. Very competent. She’s not straight to the point, very terse, and she’s not flowery and dumb either. People love her. She does what you need her to do and she’s also very nice.”

Clifford Nass is a communications professor at Stanford University and has been studying how people interact with technology for 10 years. He emphasizes the importance of a computerized voice matching the personality of the task it is assigned. “Different contexts require different voices,” says Nass. “Just as we expect librarians to sound differently than carnival barkers, and we expect announcers to have a different voice than a doctor or medical site. We expect the same thing of technology. Different car companies have to select different voices to match their brand. A BMW should sound different from a Rolls Royce should sound different from a Chevy.”

Publishers may soon have many more choices to make. In addition to editorial content and the look of the publication, they may have to decide what the publication sounds and acts like. Is the Wall Street Journal friendlier than the New York Times? Does the San Francisco Bay Guardian have a sense of humor or is it angry and combative? Is the Houston Chronicle a man or a woman? Does he or she have a Texas drawl? Both inside the newsroom and in public, newspapers and magazines will have a whole new way of defining themselves.

“What it comes down to are the fundamental differences between people and what it is about a voice that carries that,” says Nass. “Things like speed of speaking, pitch range, how high and low you get in a single conversation, how aroused or calm you seem, how deep. All these things, to our brain, suggest information about how I should expect this voice to behave, and in turn, how I will behave towards it.”

*    *    *    *    *

On the cab ride back to the office, 20 minutes before his deadline, John puts in another call to the newsroom.

“It’s me,” he says.

“Hello John. What can I do for you?”

“Ready to file. Slug – Election.” He begins to dictate his copy over the phone from the back seat.

“What’s the word count?” he asks the server.

“Word count is 752 words.”

“Read it back to me, please.”

When the computer is done reading back the copy, John makes a few quick edits and says, “OK. Send it to the editor’s desk.”

“Sure thing. Anything else you need?”

“No. Thank you. See you tomorrow.”

“Bye.”

John powers off his cell phone and sits back in the cab. Forty miles away in the newsroom, his daily story makes its way through the server to the editor’s office at the end of the hall.  The editor is pacing the room, talking into the air to her husband, when she’s interrupted by the server.

“Sorry to bother you, but slug – election just came in. Would you like to look at it?”

“Yes, thank you. I’ll be right there.”

She says goodbye to her husband and sits down in front of her monitor. “OK, computer. Ready to edit election.” The story flickers up on the monitor. She leans back and begins to speak her edits to the computer. The changes appear, one by one, up on the screen until she’s ready to send the story to press. She stretches her arms and clasps her hands behind her head. Another page one story.


Copyright Mia Lobel 2008 .-=-. mialobel@gmail.com .-=-. 415.902.0224