Clustering

Course: CIS 20310, Fall 2009
School: Delaware State
Dragoljub Pokrajac, 2003

Clustering and Unsupervised Learning
- Clustering is one of the algorithms for unsupervised learning
- Class labels are not known in advance
- Unlike in classification, clustering models are learned purely from attribute values

What is Clustering?
- Given: a set of unlabeled patterns; each pattern belongs to one or more groups
- Goal: group the patterns so that patterns "similar" to each other are in one group, and "dissimilar" patterns are in distinct groups
- Such distinguished groups are called clusters

Example
- Group these figures according to some criterion
- Attributes: number of edges, color
[Figures: clustering by color; clustering by number of edges]

Some Issues in Clustering
- The actual number of clusters is not known
- Potential lack of a priori knowledge about the data and the cluster shapes
- Clustering could be performed on-line
- Time complexity when working with large amounts of data

Types of Clustering Algorithms
- Partitioning
- Hierarchical
- Clustering for large data sets

Partitioning Methods
- Only one set of clusters is created at the output of the algorithm
- The number of clusters is usually specified
- The dataset is partitioned into several groups, and the groups are updated through iterations
- K-means, EM, PAM, CLARA, CLARANS...
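To make "groups are updated through iterations" concrete, here is a minimal Python sketch of K-means, the first partitioning method named above. It is an illustration rather than the lecture's own code. Several choices are assumptions: the function name `kmeans`, initialization by sampling K of the data points once before the loop, squared Euclidean distance, a fixed number of iterations, and keeping the old center when a cluster becomes empty.

```python
import random


def kmeans(points, k, n_iterations=10, seed=0):
    """Plain K-means on a list of equal-length tuples; returns (centers, labels)."""
    rng = random.Random(seed)
    # Initialize once: pick k distinct data points as starting centers (assumption).
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iterations):
        # Assignment step: each point goes to the nearest center.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: each center becomes the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


# Two well-separated groups of three points each (made-up data).
data = [(0.0, 0.0), (0.2, 0.1), (0.1, -0.1),
        (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centers, labels = kmeans(data, k=2)
```

On this toy data the first three points end up in one cluster and the last three in the other. Which cluster gets index 0 depends on the random initialization.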
K-Means Algorithm
- Randomly initialize K cluster centers
- Repeat:
  - Assign each point to the nearest cluster center
  - Re-estimate the cluster centers by averaging the coordinates of the points assigned to each cluster
TO DO: find better pictures (with better initialization)

[Figures: points to be clustered into three clusters; 3 randomly chosen cluster centers; then, for iterations 1 through 10, panels "Assign each point to the closest center" alternating with "Recompute cluster centers". Horizontal axis from -3 to 5, vertical axis from -3 to 4.]

Problems With K-Means
- The number of clusters must be prespecified
- The algorithm is sensitive to initialization: it may not converge to the proper clusters (TO DO: show this!)
- The algorithm does not take the shape of the clusters into account
- The algorithm does not take densities into account ("blue" cluster much denser than the other clusters)

EM Algorithm
- Idea: each point came from one of K Gaussian distributions
- Goal: estimate the parameters of the Gaussian distributions

Mixture of Gaussian Distributions
- With probability p1, data came from the distribution D1, determined by: mean μ1, covariance matrix Σ1, conditional density function p(x|D1)
- With probability p2, data came from the distribution D2, determined by: mean μ2, covariance matrix Σ2, conditional density function p(x|D2)
- ...
- With probability pK, data came from the distribution DK, determined by: mean μK, covariance matrix ΣK, conditional density function p(x|DK)

Gaussian Mixture - Formula
Similar to the well-known formula of total probability, we have a formula for the "total" probability density:

p(x) = p1*p(x|D1) + p2*p(x|D2) + ... + pK*p(x|DK)

Since the conditional distributions here are Gaussian, we have:

$$p(x \mid D_j) = \frac{1}{\sqrt{(2\pi)^d \det(\Sigma_j)}} \exp\left(-\frac{(x-\mu_j)\,\Sigma_j^{-1}\,(x-\mu_j)^T}{2}\right), \quad j = 1, 2, \ldots, K$$

where d is the number of attributes and x and μj are row vectors.

EM Algorithm Details
- We need to set the number of clusters K in advance
- Consists of two phases
- Expectation: compute, for every pattern, the probability that it came from each of the K clusters, conditioned on the observed attributes
- Maximization: update the estimated values for
  - Means of the clusters
  - Covariance matrices of the clusters
  - Cluster priors (probability that a random point belongs to a given cluster)

EM in Matlab
    for j=1:n_iterations
        % "Expectation" phase: compute, for every pattern, the probability
        % that it came from each of the K clusters
        for h=1:K
            Pmat(:,h)=p(h)*multinorm_distr_value(mu(h,:),sigma{h},X);
        end
        Pmat=Pmat./repmat(sum(Pmat,2),1,K);   % each row now sums to 1
        %
        % "Maximization" phase
        % Compute new prior probabilities (done first, so that p(h)*N below
        % equals the total membership weight of cluster h)
        for h=1:K
            p(h)=sum(Pmat(:,h))/N;   % N patterns
        end
        % Compute new means
        for h=1:K
            mu(h,:)=Pmat(:,h)'*X/p(h)/N;
        end
        % Compute new sigmas
        for h=1:K
            XM=(X-repmat(mu(h,:),N,1)).*repmat(Pmat(:,h).^0.5,1,size(X,2));
            sigma{h}=XM'*XM/p(h)/N;
        end
    end

EM Algorithm - Example
- 3500 points from a mixture of three two-dimensional Gaussian distributions
- EM algorithm initialized with:
  - Distribution means close to the true means
  - Covariance matrices equal to the covariance matrix of all the data
  - Equal priors
- Each point is colored by a mixture of three primary colors; the purer the color, the more certain the cluster membership
[Figure: true cluster membership]

Estimated and Correct Values of Distribution Parameters
- Sigmas and priors correctly estimated
- Means not correctly estimated

Correct values:
  μ1 = (3, -1)   Σ1 = [0.3 0; 0 0.3]             p1 = 0.57143
  μ2 = (1, -2)   Σ2 = [0.06 -0.04; -0.04 0.06]   p2 = 0.28571
  μ3 = (1, 1)    Σ3 = [1 0.8; 0.8 1]             p3 = 0.14286

Estimated values:
  Σ1 est. = [0.31173 -0.0025353; -0.0025353 0.29621]   p1 est. = 0.5701
  Σ2 est. = [0.060267 -0.040135; -0.040135 0.059549]   p2 est. = 0.28581
  Σ3 est. = [0.93642 0.77717; 0.77717 1.0524]          p3 est. = 0.14409
  Estimated mean components, in the order they appear in the excerpt (their assignment to μ1, μ2, μ3 is not recoverable): 1.0116, -1.023813, 1.0084, 3.0116, 1.0227, 1.0034

Problems with EM Algorithm
- Slow
- Convergence depends on the initialization
- Assumes Gaussian clusters

Hierarchical Clustering
- A set of clusters is created
- The way the clusters are created is depicted by dendrograms
- Agglomerative
- Divisive

Agglomerative Clustering
- Put each pattern into a separate cluster
- While there are more than c clusters: merge the two clusters that are closest according to some distance criterion
- Output c clusters

Distance Criterion
- SINGLE LINK: minimal distance between points in the two clusters
- COMPLETE LINK: maximal distance between points in the two clusters
- AVERAGE LINK: average distance between points in the two clusters
[Figures: single link distance; complete link distance; average link (average of the distances between all pairs)]

Properties of Single Link Distance
- Favors elongated clusters
[Figure: these two clusters are closer than the others in the single link sense]

Properties of Complete Link Distance
- Favors compact clusters
[Figure: these two clusters are closer than the others in the complete link sense]

Single Link - Example
[Figures: sequence of scatter plots, both axes from -3 to 3. Each orange square is initially a separate cluster. Panels: single link distance = 0.25, we start to merge the closest points; single link distance = 0.5, adding the next point; single link distance = 0.5, constructing a new cluster; single link distance = 1, constructing a new cluster; single link distance = 3.5355, merging.]

Dendrogram: for each pair of merged clusters, it shows their distance at the moment of merging
[Figure: dendrogram; vertical axis: distance]
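The agglomerative procedure and the single-link criterion described above can be sketched in a few lines of Python. This is a brute-force illustration under stated assumptions, not the lecture's implementation. The function names, the use of Euclidean distance via `math.dist`, and the toy points are all hypothetical. The recorded merge distances are the heights a dendrogram would show.

```python
import math


def single_link(a, b):
    """Single link: minimal distance between points of the two clusters."""
    return min(math.dist(p, q) for p in a for q in b)


def complete_link(a, b):
    """Complete link: maximal distance between points of the two clusters."""
    return max(math.dist(p, q) for p in a for q in b)


def agglomerative(points, c, linkage=single_link):
    """Merge the two closest clusters until c remain.

    Returns (clusters, merge_distances); merge_distances lists the linkage
    distance at each merge, i.e. the dendrogram heights.
    """
    clusters = [[p] for p in points]  # each pattern starts as its own cluster
    merge_distances = []
    while len(clusters) > c:
        # Brute-force search for the closest pair of clusters.
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merge_distances.append(linkage(clusters[i], clusters[j]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters, merge_distances


# Made-up points: three on a line near the origin, two further away.
points = [(0.0, 0.0), (0.25, 0.0), (0.75, 0.0), (3.0, 3.0), (3.5, 3.0)]
clusters, heights = agglomerative(points, c=2)
# heights == [0.25, 0.5, 0.5]; the clusters have 3 and 2 points
```

Recomputing every pairwise linkage at each merge is what makes this naive version slow. Passing `linkage=complete_link` switches the criterion without changing the merge loop.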
Main Problems
- Slow
- Distances must be recomputed after each merge
- O(n^2) time complexity

Divisive Clustering
- We start from a single cluster and successively split the clusters into smaller ones
- E.g., using a Minimal Spanning Tree (MST)
- A minimal spanning tree is a tree connecting all vertices of the graph such that the sum of its edge lengths is minimal
- Note: MST can also be used to do single link...

Divisive Clustering Using MST
- Consider the patterns as vertices of a fully connected graph
- Consider each pair of vertices as connected by an edge whose length equals the distance between the points
- Compute the MST
- Sort the edges of the MST in decreasing order of length
- While there are remaining edges: form a new cluster by deleting the longest remaining edge
[Figures: original data; fully connected graph; minimum spanning tree; the first two clusters made by splitting the longest edge in the MST; the third cluster made by splitting the second longest edge, proceeding until 11 clusters are formed]

Clustering Large Datasets
- Issues:
  - Time complexity vs. number of patterns and number of attributes
  - Spatial complexity
  - What if the whole dataset cannot fit in main memory?

DBSCAN
- Density-based clusters
- Time complexity, using a special data structure (R*-trees), is O(N log N), where N is the number of patterns

Couple of Definitions
- Core point: a pattern whose neighborhood contains more than Nmin patterns
- Nmin: minimal number of patterns in the neighborhood
[Figure: example with Nmin = 9, showing a core point and a non-core point]

Density Reachable Points
[Figure: some points density reachable from p1 (via p2 and p3). Shown: core point p1 and the neighborhood of p1; core point p2, a neighbor of p1, and the neighborhood of p2; core point p3, a neighbor of p2.]

Density Reachable Points - Formally
Point q is density reachable from p1 if:
- p1 is a core point
- There are some core points p2, p3, ..., pM such that
  - p2 is in the neighborhood of p1
  - p3 is in the neighborhood of p2
  - p4 is in the neighborhood of p3
  - ...
  - pM is in the neighborhood of pM-1
  - q is in the neighborhood of pM
NOTE: q does not need to be a core point!

Density-Based Cluster
- A density-based cluster contains all the points density reachable from an arbitrary core point in the cluster!

Idea of DBSCAN
- Initially, all patterns of the database are unlabeled
- BUT: for each pattern, we check whether it is labeled (it can be labeled if it was in some previously detected cluster)
- If the pattern is not labeled, we check whether it is a core point, so that it may initiate a new cluster
- If the pattern has a cluster label, we do nothing and instead process the next pattern

Idea of DBSCAN - Cont.
- If the examined point is a core point, it seeds a new cluster
- We examine its neighbors:
  - If a neighbor is already labeled, it has already been examined, so we do not need to assign a label or re-examine it
  - Otherwise (the neighbor is unlabeled), the neighbor is assigned the label of the new cluster
- We recursively examine all core points in the neighborhood

DBSCAN - Algorithm
    DBSCAN:
      FOR each pattern in dataset
        IF the pattern is not already assigned to a cluster
          IF CORE_POINT(pattern) == Yes
            ASSIGN new cluster label to the pattern
            EXAMINE(pattern.neighbors)

    EXAMINE(pattern.neighbors):
      FOR each neighbor IN pattern.neighbors
        IF neighbor is not already assigned to a cluster
          ASSIGN the current cluster label to the neighbor
          IF CORE_POINT(neighbor) == Yes
            EXAMINE(neighbor.neighbors)
          ELSE continue with the next neighbor

Important Note
- In a practical realization, we can avoid recursive calls
- Maintain and update the list of all nodes from the various neighborhoods that need to be examined
- NOTE: instead of a list, we could improve performance by using sets (sets do not contain duplicates...)
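The recursive DBSCAN pseudocode above can be written out in Python roughly as follows. This is a sketch under several assumptions that the slides do not settle. A neighborhood is taken to be all other points within Euclidean distance `eps`, with the point itself not counted. "More than Nmin" is read literally. The `ELSE return` branch is read as "move on to the next neighbor". The names `neighbors`, `is_core_point`, `examine` and `dbscan_recursive` are illustrative, and the neighbor search is brute force rather than an R*-tree.

```python
import math


def neighbors(points, i, eps):
    """Indices of all other points within distance eps of points[i]."""
    return [j for j, q in enumerate(points)
            if j != i and math.dist(points[i], q) <= eps]


def is_core_point(points, i, eps, n_min):
    """Core point: more than n_min patterns in its neighborhood."""
    return len(neighbors(points, i, eps)) > n_min


def examine(points, nbrs, label, labels, eps, n_min):
    """Label unlabeled neighbors and recurse into those that are core points."""
    for j in nbrs:
        if labels[j] is None:
            labels[j] = label
            if is_core_point(points, j, eps, n_min):
                examine(points, neighbors(points, j, eps),
                        label, labels, eps, n_min)


def dbscan_recursive(points, eps, n_min):
    """Returns one cluster label per point; None means 'never assigned'."""
    labels = [None] * len(points)
    next_label = 0
    for i in range(len(points)):
        if labels[i] is None and is_core_point(points, i, eps, n_min):
            labels[i] = next_label
            examine(points, neighbors(points, i, eps),
                    next_label, labels, eps, n_min)
            next_label += 1
    return labels


# Made-up data: four points on a line plus one isolated point.
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
labels = dbscan_recursive(pts, eps=1.5, n_min=1)
# labels == [0, 0, 0, 0, None]
```

The first point is not a core point, so it is skipped at first and only labeled later as a neighbor of the second point. The recursion depth can grow with the size of a cluster, and Python's default recursion limit is about 1000 frames, so large clusters call for an explicit list of nodes still to be examined instead of recursive calls.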
Maintaining such a list of nodes to examine leads to the following practical, non-recursive version of DBSCAN.

Non-Recursive DBSCAN
    FOR each pattern in dataset
      IF the pattern is not already assigned to a cluster
        IF CORE_POINT(pattern) == Yes
          ASSIGN new cluster label to the pattern
          ADD pattern.neighbors to the list
          WHILE list is not empty
            TAKE neighbor from the beginning of the list (and remove it from the list)
            IF neighbor is not already assigned to a cluster
              ASSIGN the current cluster label to the neighbor
              IF CORE_POINT(neighbor) == Yes
                ADD neighbor.neighbors to the list
          END WHILE

Remark
- In addition to the functionality provided in the described algorithm, DBSCAN may assign a NOISE label to a pattern
- A pattern is NOISE if it is not a core point and is not density reachable from any core point
- NOISE patterns do not belong to any cluster

DBSCAN - Example
[Figures: sequence of scatter plots, starting with "First Iteration"; search neighborhood epsilon = 2.5, Nmin = 7. Legend: unclassified; unclassified neighborhood; currently searched; assigned cluster label; noise.]
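For reference, the non-recursive DBSCAN and the NOISE remark can be combined into one short Python sketch. It is illustrative only and rests on assumptions. Neighborhoods are found by brute force (no R*-tree) and exclude the point itself. A core point has more than `n_min` neighbors. The pending list is a `collections.deque`. Every pattern still unlabeled at the end is reported as NOISE, which matches the remark because such a pattern is neither a core point nor density reachable from one. The function name `dbscan` and the toy data are hypothetical.

```python
import math
from collections import deque

NOISE = -1


def dbscan(points, eps, n_min):
    """Non-recursive DBSCAN; returns one label per point (NOISE = -1)."""

    def neighbors(i):
        # Brute-force neighborhood query: all other points within eps.
        return [j for j, q in enumerate(points)
                if j != i and math.dist(points[i], q) <= eps]

    labels = [None] * len(points)  # None means "not yet assigned"
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) <= n_min:  # not a core point: cannot seed a cluster
            continue
        labels[i] = cluster
        queue = deque(seeds)  # the list of nodes still to be examined
        while queue:
            j = queue.popleft()
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) > n_min:  # core neighbor: its neighbors join the list
                queue.extend(nbrs)
        cluster += 1
    # Patterns never reached from a core point are noise.
    return [NOISE if lab is None else lab for lab in labels]


# Made-up data: two unit squares and one isolated point.
pts = [(0, 0), (0, 1), (1, 0), (1, 1),
       (5, 5), (5, 6), (6, 5), (6, 6),
       (10, 0)]
labels = dbscan(pts, eps=1.5, n_min=2)
# labels == [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Using a set instead of the deque, as the note on sets suggests, would avoid queueing the same neighbor several times. The deque keeps the sketch closest to the "take from the beginning of the list" wording.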
