In our case, instead of co-clustering documents based on word frequency, we co-cluster users based on hashtag frequency within their tweets. To do so we develop $W_{V_t \times V_n}$, where $w_{i,j} \in W_{V_t \times V_n}$ represents the number of times vertex $v_{n,j}$ appears in the Twitter stream of $v_{t,i}$. To co-cluster $v_{t,1}, \ldots, v_{t,n}$ we follow the bipartitioning algorithm provided in [48], which results in eigenvector features similar to those we defined in the previous paragraph.

The combination of user account attributes, node-level metrics from the larger network $G$, and the spectral features explained above provides a rich feature space. Paired with a reasonably sized set of labeled vertices, we can detect an extremist community embedded in social media with supervised classification. If labeling vertices is impractical and node attributes appear informative, vertex clustering methods can be used as in [34]. Although we implement two different binary classifiers in Section 3, the specific algorithms selected for either phase of this methodology are the decision of the researcher. The end result of IVCC, an accurate extraction of vertices $A_t$, facilitates a social network analysis of the OEC of interest.

3 Case study: The ISIS OEC on Twitter

To illustrate the utility of our methodology we offer a case study of the ISIS OEC on Twitter. This case study aims to validate our proposed methodology, present its limitations in terms of ethical use, and provide illustrative examples of intelligence that can be mined from OECs. Although our case study yields strong results in terms of accuracy, and we have provided both traditional and sampling-based methods for performance evaluation, we stress that we see these methods primarily as a means to understand the interests and behaviors of this OEC. As with any classification technique, false identification of ISIS OEC members must be considered by the practitioner, and any use of IVCC to support an intervention should occur within the context of multiple sources of intelligence. We discuss intended use and the societal implications of similar methodologies in detail in Section 4.
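As an illustration of the hashtag co-clustering step described before the case study, the following minimal sketch derives spectral features from a user-by-hashtag count matrix. It assumes that the bipartitioning algorithm of [48] follows the common spectral recipe of degree-normalizing the matrix and taking its second singular vectors; the function name `cocluster_features`, its parameters, and the toy matrix in the usage note are hypothetical and not taken from the paper.

```python
import numpy as np


def _inv_sqrt(d):
    """Element-wise 1/sqrt(d), leaving zero-degree entries at 0."""
    out = np.zeros_like(d, dtype=float)
    np.divide(1.0, np.sqrt(d), out=out, where=d > 0)
    return out


def cocluster_features(W, n_vectors=1):
    """Spectral co-clustering features for a users x hashtags count matrix.

    W[i, j] is the number of times hashtag j appears in the stream of user i.
    Returns (user_features, hashtag_features, singular_values).
    Assumption: this is a generic spectral bipartitioning sketch, not
    necessarily the exact procedure used in the paper.
    """
    W = np.asarray(W, dtype=float)
    d_users = W.sum(axis=1)   # total hashtag uses per user
    d_tags = W.sum(axis=0)    # total uses per hashtag
    du = _inv_sqrt(d_users)
    dt = _inv_sqrt(d_tags)

    # Degree-normalized matrix D_u^{-1/2} W D_t^{-1/2}
    W_norm = du[:, None] * W * dt[None, :]
    U, s, Vt = np.linalg.svd(W_norm, full_matrices=False)

    # Skip the first (trivial) singular vector pair; rescale by D^{-1/2}.
    user_features = du[:, None] * U[:, 1:1 + n_vectors]
    hashtag_features = dt[:, None] * Vt[1:1 + n_vectors].T
    return user_features, hashtag_features, s
```

For a toy matrix such as `[[5, 4, 1, 0], [4, 5, 0, 1], [1, 0, 5, 4], [0, 1, 4, 5]]`, the sign of the first returned user feature separates users 0 and 1 from users 2 and 3. Thresholding at zero is a simplification; the feature columns can instead be passed directly to a classifier alongside the account attributes and node-level metrics.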
3.1 ISIS data

In this section we describe both our collection methods and dataset, but before doing so we would like to clearly state that we have complied with all of Twitter’s terms of service and privacy policies [49]. We also make no attempts to bind online and offline identity, and have de-identified all users in the data shared within this manuscript. As a result no ethics or IRB approval was obtained or required.

To develop our dataset, we instantiate our sampling strategy with five known, influential ISIS propagandists highlighted in [50]. In November 2014, we conducted a two-step snowball sample [51] of these users’ following ties using the Twitter REST API. Snowball sampling is a non-random sampling technique where a set of individuals is chosen as “seed agents.” The k most frequent accounts followed by each seed agent are taken as members of the sample. This technique can be iterated in steps, as we have done in our search. Although this technique is not random and prone to
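The snowball sampling procedure described in this section can be sketched as follows. This is a minimal illustration, not the authors' collection code: it assumes the following ties are already available as an in-memory dictionary (a hypothetical stand-in for data retrieved through the Twitter REST API), and it interprets "the k most frequent accounts" as the k accounts followed by the largest number of current seed agents. The function name `snowball_sample` and its parameters are assumptions made for this example.

```python
from collections import Counter


def snowball_sample(following, seeds, k, steps=2):
    """Iterated snowball sample over 'following' ties.

    following: dict mapping an account to the set of accounts it follows
               (hypothetical in-memory stand-in for API results).
    seeds:     initial seed agents.
    k:         number of most frequently followed accounts kept per step.
    steps:     number of snowball iterations (two in the text above).
    """
    sample = set(seeds)
    frontier = set(seeds)
    for _ in range(steps):
        # Count how many accounts in the current frontier follow each candidate.
        counts = Counter()
        for user in frontier:
            for followed in following.get(user, ()):
                if followed not in sample:
                    counts[followed] += 1
        # Rank by frequency; break ties by name so the result is deterministic.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        new_members = {account for account, _ in ranked[:k]}
        if not new_members:
            break
        sample |= new_members
        frontier = new_members  # the next step expands from the new members
    return sample
```

With `following = {"s1": {"a", "b"}, "s2": {"a", "c"}, "a": {"d", "e"}, "b": {"d"}}`, seeds `["s1", "s2"]`, and `k=2`, the first step adds `a` and `b` and the second adds `d` and `e`. Whether the next frontier should be only the newly added accounts or the whole sample is a design choice the text does not specify; the sketch uses the former.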