[iwar] [fc:NSF,.Intelligence.Community.Work.on.Data-Mining.Research]

From: Fred Cohen (fc@all.net)
Date: 2002-08-07 06:25:28



NSF, Intelligence Community Work on Data-Mining Research
By Jay Wrolstad
NewsFactor Network
August 02, 2002
http://sci.newsfactor.com/perl/story/18872.html

The NSF is working with the CIA's technology branch to develop data-mining 
techniques in order to analyze communications and hopefully prevent terrorist 
activity. The work will involve detection of specific keywords and topics 
across a variety of media.
    
Prompted by homeland security issues brought to the fore by the September 
11th terrorist attacks, the U.S. intelligence community and the National 
Science Foundation (NSF, http://www.nsf.gov/) are researching innovative 
data-mining techniques designed primarily to aid law enforcement agencies at 
various levels. Some US$8 million from the Intelligence Technology Innovation 
Center (ITIC), which is under the Central Intelligence Agency's 
administrative umbrella but is funded separately, will be spent to develop 
data-mining techniques that can extract underlying patterns -- and create 
predictive abilities -- from massive sets of data, such as television 
broadcasts and Web pages. 

Real-Time Pattern Recognition

Gary Strong, program officer for NSF's 
Directorate for Computer & Information Sciences and Engineering (CISE), told 
NewsFactor that the research will involve experts in computer science and 
will focus on two areas: data streams and data sharing. "With audio and video 
streaming there is little hope of saving information because the databases 
are constantly in flux and you have to make real-time decisions on what to 
save," said Strong. Consequently, researchers will work on "mining" 
underlying patterns and trends while pinpointing changes in those patterns. 

This work will involve both topic and word "spotting," or detecting specific 
words or word clusters.

Data-Sharing Policies

Because the intelligence 
community and law enforcement agencies have traditionally lacked the capacity 
or legal authority to share data, this research will evaluate new policies 
for sharing that incorporate "probable cause" conditions, said Strong. 
Efforts to use government-owned databases in a coordinated way currently 
present problems because of incompatibility among the databases, not to 
mention privacy restrictions. 

Developing data-mining techniques within these constraints is a challenge 
regardless of national security implications, he added. "We now have an 
opportunity to develop a way to allow searches of protected information, such 
as medical records, while protecting privacy of the data," Strong noted. 

Cooperative Agreement

Besides national security, other applications for the 
research range from natural disaster response to bioinformatics, which 
involves searching through large numbers of documents to manage biological 
functions. Cooperation between the ITIC and the CIA is made possible through 
the interagency Knowledge Discovery and Dissemination (KDD) program. 

Through KDD, the NSF identifies projects and programs in which research might 
be related to national security and then consults the research community to 
focus its efforts, where appropriate, in that direction. An NSF-sponsored 
workshop held in December identified some 40 potential data-mining projects 
of interest to the intelligence community. Of those, 15 were chosen to 
receive funding over the next three years as part of the cooperative venture. 

Projects Outlined

In one chosen project, SRI International will investigate 
ways to enable machines to recognize individuals by the way they talk, a 
sophisticated capability that goes far beyond existing voice-recognition 
technology. 

Strong said this research includes "talk printing," or identifying the 
specific ways in which individuals talk, including pauses or speech 
inflections. In another project, researchers at Columbia University are 
working on a system to track patterns in data types -- such as broadcast news 
programs, online chat rooms, e-mail and voice mail -- and then automatically 
generate a summary of information about a specific event. "They will take 
large numbers of messages and produce short summaries that take into 
consideration both time factors and changing news reports to determine the 
most accurate information," Strong said. Meanwhile, scientists at IBM's T.J. 
Watson Research Center hope to create a topic-spotting method that can search 
for a specific area of interest in all languages.


------------------
http://all.net/ 




This archive was generated by hypermail 2.1.2 : 2002-10-01 06:44:32 PDT