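data n. 1. Information, material. [This word is the plural of datum, but datum is rarely used; data generally serves as a collective noun and in spoken usage often takes a singular verb. A single item of information is referred to as "this data".] 2. [US] Facts or knowledge (obtained by observation). a data book: a reference book. gather data on ...: collect information [data] on ... The data is not enough to be convincing.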
This paper divides the web usage mining process into three main parts: data collection, which draws on the three server logs (access, referer, and agent), the HTML files that make up the site, and registration data; data preprocessing, which includes data cleaning, user identification, session identification, path completion, and transaction identification; and data analysis, which consists of mining frequent access paths and association rules. Taking into account the characteristics of web pages and the effect of the homepage's hit count, the paper improves the algorithm for finding frequent items (frequently visited pages). Building on web usage mining, it also introduces some web structure mining to supplement the association rules mined over visited pages; related pages found through web structure mining are recommended to users as well, which improves the precision of the association rules to some extent.
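As a rough illustration of the session identification step named above, the following Python sketch splits one user's page requests into sessions whenever the gap between consecutive requests exceeds a timeout. The 30-minute threshold, the function name `split_sessions`, and the `(timestamp, url)` record layout are assumptions made for illustration; the paper's own procedure is not described in this excerpt.

```python
from datetime import datetime, timedelta


def split_sessions(requests, timeout=timedelta(minutes=30)):
    """Group (timestamp, url) pairs for one user into sessions.

    A new session starts when the gap between two consecutive
    requests exceeds `timeout` (an assumed, commonly used heuristic).
    """
    sessions = []
    last_time = None
    for ts, url in sorted(requests):
        if last_time is None or ts - last_time > timeout:
            sessions.append([])  # open a new session
        sessions[-1].append(url)
        last_time = ts
    return sessions


if __name__ == "__main__":
    log = [
        (datetime(2020, 1, 1, 10, 0), "/"),
        (datetime(2020, 1, 1, 10, 5), "/products"),
        (datetime(2020, 1, 1, 11, 0), "/contact"),
    ]
    print(split_sessions(log))
```

The first two requests are five minutes apart and fall into one session; the third arrives 55 minutes later and opens a second one.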
In the data preprocessing step, the data formats of the theoretical and measured blade profiles are unified by removing redundant points from the theoretical profile, discretizing the arcs of the inlet and outlet edges, and offsetting the theoretical profile. Finally, once the measured profile has been matched against the theoretical profile for best coincidence, each of the main error items of the measured blade is obtained using appropriate algorithms.
Based on the format and technical reports of the MODIS 1B datasets, this paper discusses in detail the methods for processing EOS/MODIS 1B data, including data preprocessing, data extraction, data calibration, and projection transformation, and gives concrete solutions to some problems that can arise during data processing. It focuses on a de-striping method, EDF adjusting, used during calibration to remove stripes from band images, and on matching the earth location data between pixels of different resolutions during projection, including the interpolation of location data that this entails.
Based on a full analysis of the requirements of bank CRM, the thesis proposes a data-warehouse-based architecture for a bank CRM system. It then analyses the core technical components of that architecture, including the customer database system, the data warehouse, and data mining, and presents the functional framework of the bank CRM system. The thesis puts its emphasis on the methods and process for preprocessing the customer data accumulated over many years in the bank's business systems as it is migrated into the data warehouse; this process comprises data copying, data cleansing and transformation, data integration, quality verification, and data loading. Finally, the thesis discusses the key technology for applying a data warehouse in bank CRM, the cleansing of customer data, and presents cleansing methods aimed at noisy values, missing values, conflicting values, and duplicated values.
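Two of the cleansing targets listed above, duplicated values and missing values, can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name `cleanse`, the `customer_id` key, and the rule of keeping the first record seen are hypothetical choices, not the thesis's method.

```python
def cleanse(records, key="customer_id", defaults=None):
    """Drop duplicate records and fill missing fields.

    Records sharing the same `key` value are treated as duplicates and
    only the first one is kept (an assumed policy). Fields whose value
    is None or absent are filled from `defaults`.
    """
    defaults = defaults or {}
    seen = set()
    cleaned = []
    for rec in records:
        if rec[key] in seen:
            continue  # duplicate of an earlier record
        seen.add(rec[key])
        filled = dict(rec)  # copy so the input is not modified
        for field, value in defaults.items():
            if filled.get(field) is None:
                filled[field] = value
        cleaned.append(filled)
    return cleaned


if __name__ == "__main__":
    customers = [
        {"customer_id": 1, "city": None},
        {"customer_id": 1, "city": "Shanghai"},
        {"customer_id": 2, "city": "Beijing"},
    ]
    print(cleanse(customers, defaults={"city": "unknown"}))
```

Keeping the first record is the simplest policy; a real migration would more likely merge duplicates or resolve conflicting values by rule before loading.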
However, the second technology has the following disadvantages: first, data paging and TMRM (multiresolution model) generation are integrated into one module; next, the data structures used are very complicated and large; in addition, the data preprocessing workload is very heavy and frequent data paging requires a high-performance server; finally, this paging method is very difficult to implement. As for the first technology, a streaming approach, it loads a slightly larger amount of data at a time, but an important advantage is that data paging and TMRM generation are not interdependent, so if handled properly it can be applied in practice more easily than the second one. As an implementation of the first technology, Lindstrom introduced a method that uses a quadtree and a triangle binary tree to organize terrain data and adopts a multithreaded mechanism to carry out data paging and simplification. The drawbacks of that method are as follows: its data structure depends on the physical partition of the terrain and is therefore very large; the computational cost of generating the multiresolution model also depends on the granularity of the physical partition, so that with a coarse partition the data extent grows and the computation increases sharply; and the method cannot generate the model incrementally.
We discuss several methods of spatial data preprocessing: filling in missing data, handling noisy data, data reduction, discretization of continuous attributes, and spatial characteristic generalization based on attribute-oriented induction. The discretization of continuous attributes covers common discretization methods, GIS attribute data reduction based on rough set theory, and attribute data generalization based on cloud theory.
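One of the common discretization methods for continuous attributes is equal-width binning, sketched below in Python. The excerpt does not say which common methods the authors use, so this choice, the function name `equal_width_bins`, and the sample values are assumptions for illustration only.

```python
def equal_width_bins(values, n_bins):
    """Map each continuous value to a bin index in 0..n_bins-1.

    The range [min, max] is cut into n_bins intervals of equal width;
    the maximum value is placed in the last bin.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # all values identical: a single bin
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


if __name__ == "__main__":
    elevations = [0, 2.5, 5, 7.5, 10]  # hypothetical attribute values
    print(equal_width_bins(elevations, 4))
```

Equal-width binning is simple but sensitive to outliers, which is one reason approaches such as the rough set and cloud theory methods mentioned above are considered.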
It is therefore necessary to examine calcium activity and distribution in nerve cells. In this paper, a method for visualizing intracellular Ca2+ in three dimensions was established using laser scanning confocal microscopy (LSCM) and computer visualization techniques. The method comprises cell culturing and staining, optimization of the confocal microscope parameters, preprocessing of the confocal data, and computer-based 3D visualization of Ca2+. Using it, we investigated the Ca2+ distribution in cultured hippocampal neurons under different objectives. The method provides an intuitive tool for studying the mechanisms of intracellular calcium action.
Surface reconstruction is one of the key techniques in reverse engineering. Based on unorganized discrete points taken from the finite element triangular mesh on the exterior surface of a hollow aeroengine turbine blade, this dissertation studies related problems such as scattered data preprocessing, quadrilateral domain partition, and surface reconstruction.
Chapter five, with reference to the "Discipline Inspection and Supervision Statistical Analysis System" (JJMS), discusses the general approach to applying descriptive data mining techniques to a statistical information system, including the design of the multidimensional business space, the design of the method base, the construction of data sources, data preprocessing, and system architecture design. Finally, it discusses the system's operating results, its shortcomings and possible improvements, and the conclusions and lessons drawn.
This thesis discusses clustering techniques against the background of large-scale nuclear physics scientific data mining. It first introduces the key techniques and main tasks of data mining, then analyzes data preprocessing and clustering techniques at three levels (theory, algorithms, and applications) in light of the characteristics of scientific data. For data preprocessing, it proposes practical methods for segmenting, denoising, integrating, and transforming HDF5 scientific data, applies the “truncation method” and the “successive difference method” for data reduction, and finally extracts information from the data.