Storm [1] is a distributed event processing framework used by companies such as Twitter and Groupon. In Storm, a computation is modeled as a topology, that is, a directed graph of processing components (spouts and bolts) that communicate with each other by sending and receiving events. These components are usually deployed over a cluster of machines in order to increase parallelism and improve performance. A relevant issue concerns fault tolerance for stateful components: how can their state be preserved in case of failures? Possible solutions include storing the state in stable storage [2, 3] and replicating the stateful components.
Provide the detailed design of a realizable strategy to achieve fault tolerance for stateful components in Storm.
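As a starting point for the stable-storage option mentioned above, the following is a minimal sketch of periodic checkpointing for a stateful counting component. It is only an illustration under stated assumptions: the class CheckpointedCounter, its parameters and the JSON file layout are hypothetical, and the code does not use the Storm API. Events processed after the last checkpoint are lost on a crash and would have to be replayed by the upstream component.

```python
import json
import os
import tempfile


class CheckpointedCounter:
    """Hypothetical stateful component (not the Storm API).

    It counts events per key and periodically saves its state to a file
    that stands in for stable storage.
    """

    def __init__(self, path, checkpoint_every=100):
        self.path = path
        self.checkpoint_every = checkpoint_every
        self.counts = {}
        self.processed = 0
        self._recover()

    def _recover(self):
        # On (re)start, reload the last checkpoint if one exists.
        if os.path.exists(self.path):
            with open(self.path, "r", encoding="utf-8") as f:
                snapshot = json.load(f)
            self.counts = snapshot["counts"]
            self.processed = snapshot["processed"]

    def checkpoint(self):
        # Write to a temporary file first, then rename it over the old
        # checkpoint, so that a crash never leaves a half-written file.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump({"counts": self.counts, "processed": self.processed}, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)

    def process(self, key):
        # Update the in-memory state and checkpoint every N events.
        self.counts[key] = self.counts.get(key, 0) + 1
        self.processed += 1
        if self.processed % self.checkpoint_every == 0:
            self.checkpoint()
```

The checkpoint interval trades recovery cost against runtime overhead: a smaller interval means fewer events to replay after a failure but more writes to stable storage.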
In May 2013, version 1.0 of Cloudera Impala was released [1]. Impala [2] is an open source, distributed SQL query engine for Apache Hadoop that bypasses MapReduce and accesses the data directly, so as to drastically improve performance. It was inspired by Google Dremel [3].
Provide a detailed discussion about the key features and technologies employed by Impala, together with an analysis of pros and cons with respect to other Hadoop query engines, like Hive.
References
[1] Cloudera Impala 1.0: It’s Here, It’s Real, It’s Already the Standard for SQL on Hadoop, http://blog.cloudera.com/blog/2013/05/cloudera-impala-1-0-its-here-its-real-its-already-the-standard-for-sql-on-hadoop/
[3] Sergey Melnik, Andrey Gubarev, Jing Jing Long, Geoffrey Romer, Shiva Shivakumar, Matt Tolton, Theo Vassilakis, "Dremel: Interactive Analysis of Web-Scale Datasets", 2010
3
Bonomi Silvia
bonomi@dis.uniroma1.it
Opportunism in Social Networking
In recent years, we have witnessed a massive proliferation of mobile devices with ever-increasing computational and communication capabilities, such as smartphones, netbooks, tablets and heterogeneous sensors. This new technological context has opened new research directions, and new communication paradigms have emerged. Among them, opportunistic networks and opportunistic computing are some of the most promising and interesting. In particular, opportunistic behaviors in social networks are of interest because they require adapting classical dissemination models to account for selfish behaviors.
The student is required to study and report on dissemination models in social networks (both opportunistic and not). The report must also analyze the challenges and/or benefits arising from the presence of opportunistic agents in social networks.
- Anna-Kaisa Pietilänen and Christophe Diot. 2012. Dissemination in opportunistic social networks: the role of temporal communities. In Proceedings of the thirteenth ACM international symposium on Mobile Ad Hoc Networking and Computing (MobiHoc '12). http://dl.acm.org/citation.cfm?id=2248396
- Yahui Wu, Su Deng, Hongbin Huang: Information Propagation through Opportunistic Communication in Mobile Social Networks. MONET 17(6): 773-781 (2012) http://link.springer.com/article/10.1007/s11036-012-0401-3
- Mary R. Schurgot, Cristina Comaniciu, Katia Jaffrès-Runser: Beyond traditional DTN routing: social networks for opportunistic communication. IEEE Communications Magazine 50(7): 155-162 (2012) http://arxiv.org/pdf/1110.2480.pdf
- Ceren Budak, Divyakant Agrawal, and Amr El Abbadi. 2011. Limiting the spread of misinformation in social networks. In Proceedings of the 20th international conference on World wide web (WWW '11). http://dl.acm.org/citation.cfm?id=1963499
4
Bonomi Silvia
bonomi@dis.uniroma1.it
Dynamic Networks: Models and Distributed Agreement Abstractions
Modern distributed systems are characterized by the continuous evolution of the entities belonging to the system itself. Traditional models designed for "static" systems must be extended to deal with network topology changes (both in terms of members and communication links). In addition, protocols implementing common distributed abstractions (e.g. consensus) must be adapted to cope with the dynamicity of the network.
This assignment can be split into two related sub-tasks: one student is required to study and report on recent models for dynamic networks; the second student is required to study and report on new computation abstractions (e.g. consensus protocols, leader election protocols, register protocols, etc.) designed for dynamic distributed systems.
- Casteigts, A., Flocchini, P., Quattrociocchi, W., & Santoro, N. (2012). Time-varying graphs and dynamic networks. International Journal of Parallel, Emergent and Distributed Systems, 27(5), 387-408. http://www.tandfonline.com/doi/pdf/10.1080/17445760.2012.668546
- Martin Biely, Peter Robinson, and Ulrich Schmid. 2012. Agreement in directed dynamic networks. In Proceedings of the 19th international conference on Structural Information and Communication Complexity (SIROCCO'12) http://arxiv.org/abs/1204.0641
- Fabian Kuhn, Rotem Oshman, and Yoram Moses. 2011. Coordinated consensus in dynamic networks. In Proceedings of the 30th annual ACM SIGACT-SIGOPS symposium on Principles of distributed computing (PODC '11) http://dl.acm.org/citation.cfm?id=1993808&CFID=332219577&CFTOKEN=96580848
- F. Harary and G. Gupta. 1997. Dynamic graph models. Math. Comput. Model. 25, 7 (April 1997)
- Fabian Kuhn, Nancy Lynch, and Rotem Oshman. 2010. Distributed computation in dynamic networks. In Proceedings of the 42nd ACM symposium on Theory of computing (STOC '10) http://dl.acm.org/citation.cfm?id=1806760&CFID=332219577&CFTOKEN=96580848
- R. Baldoni, S. Bonomi, M. Raynal. Implementing a Regular Register in an Eventually Synchronous Distributed System Prone to Continuous Churn. IEEE Transactions on Parallel and Distributed Systems, volume 23, num. 1, pages 102-109, 2012 http://midlab.dis.uniroma1.it/articoli/BBR_TPDS12.pdf
5
Querzoni Leonardo
querzoni@dis.uniroma1.it
How a large-scale system evolves
Please refer to the set of slides available on the main page of this course for detailed information on exam rules. Note that the following two lists contain only suggestions. Students are strongly encouraged to propose the topics they're most interested in. Struck-through topics have already been assigned and are no longer available. For each topic, a title and a link to one or more suggested readings are provided. The list of suggested readings is not exhaustive, as reviewing the state of the art is part of your work.
Many of these papers are freely available. Those that require an active subscription can be downloaded from computers connected through the proxy installed at La Sapienza. Check the BIXY service (in Italian), or contact me for further details.
Suggested topics:
Byzantine failures, altruistic processes and rational behaviors. A. S. Aiyer, L. Alvisi, A. Clement, M. Dahlin, J.-P. Martin, and C. Porth. BAR Fault Tolerance for Cooperative Services. SOSP 2005.
The hurdles of security in cloud computing platforms. Craig Gentry. Fully homomorphic encryption using ideal lattices. STOC, 2009.
Dependable and secure storage A. Bessani, M. Correia, B. Quaresma, F. Andre and Paulo Sousa. DepSky: Dependable and Secure Storage in a Cloud-of-Clouds. EuroSys 2011.
Consistency and performance of large scale systems W. Lloyd, M. J. Freedman, M. Kaminsky, and D. G. Andersen. Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS. SOSP 2011.
C. Li, D. Porto, A. Clement, J. Gehrke, N. Preguica and R. Rodrigues. Making Geo-Replicated Systems Fast as Possible, Consistent when Necessary OSDI 2012.
Read/Write atomic storage in dynamic environments M. K. Aguilera, I. Keidar, D. Malkhi and A. Shraer. Dynamic atomic storage without consensus. Journal of the ACM (JACM), Volume 58 Issue 2, 2011.
From disk- to flash-based storage M. Balakrishnan, D. Malkhi, V. Prabhakaran and T. Wobber. Going beyond Paxos. Microsoft Research Technical Report, 2011.
M. Balakrishnan, D. Malkhi, V. Prabhakaran, and T. Wobber. CORFU: A Shared Log Design for Flash Clusters. NSDI 2012.
Array-based data storage P. Brown. Overview of SciDB: large scale array storage, processing and analysis. SIGMOD 2010.
V.-T. Tran, B. Nicolae, G. Antoniu and L. Bouge. Pyramid: A large-scale array-oriented active storage system. LADIS 2011.
Speculative execution in replicated services V. G. Bortnikov, G. Chockler, D. Perelman, A. Roytman, S. Shachor and I. Shnayderman. FRAPPÉ: Fast Replication Platform for Elastic Services. LADIS 2011.
Key/Value storage systems H. Lim, B. Fan, D. G. Andersen and M. Kaminsky. SILT: A Memory-Efficient, High-Performance Key-Value Store. SOSP 2011.
D. Beaver, S. Kumar, H. C. Li, J. Sobel and P. Vajgel. Finding a needle in Haystack: Facebook’s photo storage. OSDI 2010.
L. Glendenning, I. Beschastnikh, A. Krishnamurthy and T. Anderson. Scalable Consistency in Scatter. SOSP 2011.
Consistent data storage B. Calder et al. Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency. SOSP 2011.
J. C. Corbett et al. Spanner: Google’s Globally-Distributed Database. OSDI 2012.
Locality and performance E. B. Nightingale, J. Elson, J. Fan, O. Hofmann, J. Howell and Y. Suzue. Flat Datacenter Storage. OSDI 2012.
P. Costa, A. Donnelly, A. Rowstron, and G. O'Shea. Camdoop: Exploiting In-network Aggregation for Big Data Applications. NSDI 2012.
Check the following conference proceedings for more topics:
Advanced Topics In Security Of Complex Systems - Suggested Topics
Please refer to the set of slides available on the main page of this course for detailed information on exam rules. Note that the following two lists contain only suggestions. Students are strongly encouraged to propose the topics they're most interested in. Struck-through topics are not available. For each topic, a title and a link to a publication where more information can be found are provided.
Suggested topics:
The impact of network topologies on intrusion possibilities and detection probability. The effect of network topology on the spread of epidemics
State of the art for digital certificates. http://www.dis.uniroma1.it/~querzoni/teaching/1213/AdvancedTopicsInSecurityOfComplexSystems/SuggestedTopics
Beyond RSA: elliptic curve cryptography and other methods in the state-of-the-art for public-key cryptography. SEC 1: Elliptic Curve Cryptography
Confidential search: how to search encrypted data Practical Techniques for Searches on Encrypted Data
The hurdles of security in cloud computing platforms. Craig Gentry. Fully homomorphic encryption using ideal lattices. STOC, 2009.
Secure hash functions: from SHA-1 to more secure message digest algorithms. NIST CRYPTOGRAPHIC HASH ALGORITHM COMPETITION
Cryptographic modules: current standards and their implementation in real-world products. FIPS PUB 140-2
Secure mail protocols and legal aspects. Technical rules for the Italian PEC (ONLY IN ITALIAN)
The hurdles of security in cloud computing platforms. Fully homomorphic encryption using ideal lattices
Security for Virtual Currency Bitcoin: A Peer-to-Peer Electronic Cash System
GPU computing vs security: how to make a strong password weak Update: New 25 GPU Monster Devours Passwords In Seconds
Platforms for federated identity management. Build a running demo to test identity federation among several providers: Google, Facebook, Windows Live ID, Shibboleth, Microsoft ADFS. Provide insights from your implementation activity.
Petroni Fabio
petroni@dis.uniroma1.it
Distributed Collaborative Filtering Techniques
Collaborative Filtering (CF) is one of the most successful approaches to building recommender systems. It uses the known preferences of a group of users to make recommendations or predictions of the unknown preferences of other users. Several surveys in the literature present an overview of this field [1, 2]. However, most existing CF-based recommender systems work in a centralized way. Only a few works [3, 4] have tackled the problem of implementing CF in a distributed fashion. The scope of the project is to investigate the state of the art of distributed CF solutions in order to list the strengths and weaknesses of each approach.
References
[1] Gediminas Adomavicius and Alexander Tuzhilin. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. Knowledge and Data Engineering, IEEE Transactions on, 17(6):734–749, 2005.
[2] Xiaoyuan Su and Taghi M Khoshgoftaar. A survey of collaborative filtering techniques. Advances in Artificial Intelligence, 2009:4, 2009.
[3] Peng Han, Bo Xie, Fan Yang, and Ruimin Shen. A scalable p2p recommender system based on distributed collaborative filtering. Expert systems with applications, 27(2):203–210, 2004.
[4] Fady Draidi, Esther Pacitti, and Bettina Kemme. P2Prec: a P2P recommendation system for large-scale data sharing. In Transactions on Large-Scale Data- and Knowledge-Centered Systems III, pages 87–116. Springer, 2011.
7
Petroni Fabio
petroni@dis.uniroma1.it
Collaborative Filtering with MapReduce
Collaborative Filtering (CF) is the process of identifying similar users and recommending what similar users like. Examples of CF applications include recommending books, CDs and other products at Amazon.com [1], movies by MovieLens [2], and news at Google News [3]. The scope of the project is to implement a CF algorithm using the MapReduce programming model and evaluate its performance on a real dataset. The student is invited to use:
Any different proposal (e.g. a different algorithm) is welcome, but it has to be discussed with the tutor.
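To give an idea of how a CF computation can be split into map and reduce phases, here is a minimal in-memory sketch based on item co-occurrence counting. This is one simple technique chosen for illustration, not necessarily the algorithm intended for the project; the function names and the input layout (a dictionary from user to the list of items the user liked) are assumptions, and the shuffle step that Hadoop performs between map and reduce is simulated with a dictionary.

```python
from collections import defaultdict
from itertools import combinations


def map_user_items(user, items):
    """Map phase: emit ((item_a, item_b), 1) for each pair of items
    liked by the same user."""
    for a, b in combinations(sorted(set(items)), 2):
        yield (a, b), 1


def reduce_pair(pair, counts):
    """Reduce phase: sum the co-occurrence counts of one item pair."""
    return pair, sum(counts)


def run_job(user_items):
    """Simulate map, shuffle and reduce in memory over {user: [items]}."""
    grouped = defaultdict(list)
    for user, items in user_items.items():
        for key, value in map_user_items(user, items):
            grouped[key].append(value)
    return dict(reduce_pair(k, v) for k, v in grouped.items())


def recommend(user_items, user, top_n=3):
    """Score each item the user does not have by how often it co-occurs
    with the items the user already has, and return the best top_n."""
    cooc = run_job(user_items)
    owned = set(user_items[user])
    scores = defaultdict(int)
    for (a, b), n in cooc.items():
        if a in owned and b not in owned:
            scores[b] += n
        elif b in owned and a not in owned:
            scores[a] += n
    # Sort by decreasing score, breaking ties by item name.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]
```

In a real MapReduce job the co-occurrence table would be written to the distributed file system and the scoring step would be a second job; here both run in a single process to keep the sketch short.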
References
[1] Greg Linden, Brent Smith, and Jeremy York. Amazon.com recommendations: Item-to-item collaborative filtering. Internet Computing, IEEE, 7(1):76–80, 2003.
[2] Bradley N Miller, Istvan Albert, Shyong K Lam, Joseph A Konstan, and John Riedl. Movielens unplugged: experiences with an occasionally connected recommender system. In Proceedings of the 8th international conference on Intelligent user interfaces, pages 263–266. ACM, 2003.
[3] Abhinandan S Das, Mayur Datar, Ashutosh Garg, and Shyam Rajaram. Google news personalization: scalable online collaborative filtering. In Proceedings of the 16th international conference on World Wide Web, pages 271–280. ACM, 2007.
[4] http://hadoop.apache.org/.
[5] Thomas Hofmann. Latent semantic models for collaborative filtering. ACM Transactions on Information Systems (TOIS), 22(1):89–115, 2004.
A social network must provide its users with a report of the revenue obtained from publishing advertisements on their pages.
Three input data sets must be considered:
1. the log of user activity for a campaign in a month. Each row in the log corresponds to one user click and includes the following attributes:
- Advertisement identifier
- Type of product advertised (rough textual description like electronics, outfit, finance, etc.)
- Advertiser Id (the identifier of the owner of the product)
- Publisher Id (the identifier of the user on whose page the ad was published)
- Viewer Id (can be anonymous if the click comes from outside the social network)
2. the global number of page views per publisher in the month.
3. the list of user-to-user bidirectional connections.
Compute for each publisher the total revenue obtained in the month. The revenue is computed as the CPC (cost-per-click) multiplied by the number of unique clicks per user. The CPC is computed as the number of clicks divided by the number of page views, multiplied by 0.5 if the viewer is anonymous and by 0.3 if the viewer is a neighbor of the publisher in the network.
- Implement a program that randomly generates reasonably “big” data sets
- Implement the MapReduce job(s) for computing the report. This might involve building intermediate data structures, such as adjacency lists.
- Perform a set of experiments regarding performance analysis of the implemented jobs. Report must present the results of the experiments.
- (Optional) Implement a simple web application with a search form where, by typing the ID of a publisher, the list of actual and “possible” friends is shown.
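The revenue rule above leaves some room for interpretation, so the following single-process sketch only shows one possible reading, which a MapReduce implementation could then reproduce. All names are hypothetical, and the assumptions are: clicks are (ad id, publisher id, viewer id) triples; an anonymous viewer is represented by None, so all anonymous clicks are treated as a single viewer; the "unique clicks" of a viewer are the distinct ads that viewer clicked on a publisher's page; viewers that are neither anonymous nor neighbors get a multiplier of 1.0; every publisher has at least one page view.

```python
from collections import defaultdict

ANONYMOUS = None  # assumed marker for viewers outside the social network


def viewer_weight(publisher, viewer, neighbors):
    """CPC multiplier: 0.5 for anonymous viewers, 0.3 for neighbors of
    the publisher, 1.0 otherwise (the last value is an assumption)."""
    if viewer is ANONYMOUS:
        return 0.5
    if viewer in neighbors.get(publisher, set()):
        return 0.3
    return 1.0


def publisher_revenue(clicks, page_views, connections):
    """clicks: iterable of (ad_id, publisher_id, viewer_id);
    page_views: {publisher_id: views in the month}, assumed > 0;
    connections: iterable of bidirectional (user_a, user_b) pairs."""
    # Intermediate structure: adjacency sets built from the edge list.
    neighbors = defaultdict(set)
    for a, b in connections:
        neighbors[a].add(b)
        neighbors[b].add(a)

    total_clicks = defaultdict(int)
    unique_ads = defaultdict(set)  # (publisher, viewer) -> distinct ads
    for ad_id, publisher, viewer in clicks:
        total_clicks[publisher] += 1
        unique_ads[(publisher, viewer)].add(ad_id)

    revenue = defaultdict(float)
    for (publisher, viewer), ads in unique_ads.items():
        base_cpc = total_clicks[publisher] / page_views[publisher]
        weight = viewer_weight(publisher, viewer, neighbors)
        revenue[publisher] += base_cpc * weight * len(ads)
    return dict(revenue)
```

Grouping by (publisher, viewer) corresponds naturally to a composite key in a first MapReduce job, while the join with the page-view and connection data sets would typically require additional jobs or a map-side join.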
9
Virgillito Antonino
virgilli@istat.it
People You May Know
Implement MapReduce job(s) for computing the "people you may know" functionality in a social network. For each user, the problem is to find the users that are not directly connected to the user in the social network but are likely to be connected in the real world, according to the relationships among the user's direct neighbors.
Two input data sets must be considered:
1. the list of users, including the following attributes:
- User id
- User name
2. the list of user-to-user bidirectional connections.
- Devise a reasonable "may know" criterion (e.g. the first X users that are reachable from a user in two-step connections)
- Implement a program that randomly generates reasonably “big” data sets
- Implement the Map-Reduce job(s) to determine the “may know” list for each user. This might involve building intermediate data structures, such as adjacency lists.
- Perform a set of experiments regarding performance analysis of the implemented jobs. Report must present the results of the experiments.
- (Optional) Implement a simple web application with a search form where, by typing the name of a user, the list of actual and “possible” friends is shown.
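As an illustration of the example criterion (users reachable in two steps), the sketch below ranks candidates by the number of mutual friends and keeps the first X. It runs in a single process and only mimics the map and reduce steps; the function names, the tie-breaking rule and the parameter top_x are assumptions made for the example.

```python
from collections import Counter, defaultdict


def build_adjacency(connections):
    """Intermediate structure: adjacency sets from bidirectional edges."""
    adjacency = defaultdict(set)
    for a, b in connections:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def map_candidates(user, friends):
    """Map: any two distinct friends of `user` have `user` as a mutual
    friend, so emit each ordered pair (f, g)."""
    for f in friends:
        for g in friends:
            if f != g:
                yield f, g


def may_know(connections, top_x=3):
    """Reduce: count mutual friends per candidate, drop users that are
    already direct friends, and keep the top_x candidates per user."""
    adjacency = build_adjacency(connections)
    mutual = defaultdict(Counter)
    for user, friends in adjacency.items():
        for f, g in map_candidates(user, friends):
            mutual[f][g] += 1

    result = {}
    for user in adjacency:
        candidates = [(c, n) for c, n in mutual[user].items()
                      if c not in adjacency[user]]
        # More mutual friends first; ties broken by user id.
        candidates.sort(key=lambda cn: (-cn[1], cn[0]))
        result[user] = [c for c, _ in candidates[:top_x]]
    return result
```

Note that the map step emits a number of pairs that is quadratic in the number of friends of each user, which is one of the aspects worth measuring in the experiments on large data sets.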
Suggestions for experiments
Experiments mainly (though not necessarily) concern the performance analysis of the implemented job. Creativity is encouraged.
In the following I suggest some parameters that can be varied in the experiments, though students are expected to focus on those that are more significant with respect to the specific problem:
Cluster size. (Baseline: execution on 1 node.)
Input size. As big as it gets.
Number of maps and reduces.
Input split size.
Specific optimizations on the job and alternative implementations (suggested for single-node implementations).
Alternative input parsing. Example: use a custom RecordReader vs. parsing the input in the Map.
Alternative partitioning solutions.
Example measures:
Overall execution time.
Average time for maps and reduces.
Number of map executions.
10
Cerocchi Adriano
cerocchi@dis.uniroma1.it
INVESTIGATION ON ENERGY STORAGE TRENDS
Energy efficiency is a key point in a smart grid scenario. Several methods should be used in combination to reduce waste and maximize renewable energy usage for a given grid. Introducing energy storage within such systems opens new perspectives on efficiency, with results depending on the technologies applied in the different scenarios.
New technological trends in energy storage (e.g. ultracapacitors) should be analyzed, indicating the added value they can give. The focus should be on how efficiency can increase and which problematic issues (phase jumps, costs) must be faced for each implementation considered. A comparison of the merits of the different trends should indicate the most suitable implementations for the functionalities that need to be achieved.
11
Cerocchi Adriano
cerocchi@dis.uniroma1.it
ELECTRICAL SWITCHING IMPLICATIONS
The smart grid concept can use dynamic energy routing to adapt the current flow to network conditions wherever this yields an advantage.
Different areas and loads within the network should be supplied by the lines that ensure the best performance in terms of efficiency, maximization of local power generation and cost reduction.
In the common switching operations, which change a previous configuration of plugged lines and create a new mapping of those lines, close attention should be paid to the electrical effects of the switch. Currents, voltages and phases could vary significantly because of the different features of the connected lines (inherent electrical characteristics, kind of loads, etc.). For example, capacitors should be dynamically assigned to some lines to avoid phase shifts, which decrease efficiency.
Investigate the best trends and possibilities for performing “efficient” switching operations, motivating the requirements that a switch must satisfy to be considered truly convenient.
12
Cerocchi Adriano
cerocchi@dis.uniroma1.it
COMMUNICATION PROFILING FOR A SMART ENVIRONMENT (Practical)
Smart networks are made of heterogeneous devices which can be viewed as distributed intelligent agents communicating with each other and performing “smart” tasks. The communication model of the network needs to take latency into account because it can have a significant impact on the higher-level capabilities of a smart scenario.
Latency is 1) not zero and 2) not constant, which makes it an inherent parameter affecting the performance and reliability of the smart network.
Using a smart infrastructure (provided by us) as a testbed, profile the whole communication model, characterizing the number of exchanged messages and the delays, and highlighting strengths and weaknesses (e.g. bottlenecks) that could significantly affect network functionalities.
13
Montanari Luca/Cerocchi Adriano
montanari@dis.uniroma1.it
Monitoring of power consumption data
This work requires monitoring and learning power-related data trends in order to promptly recognize variations with respect to normal behavior. The student should use a known Java library that implements Artificial Neural Networks to learn and recognize situations.
In order to present the work, the application must also work off-line using log-files.
14
Montanari Luca
montanari@dis.uniroma1.it
Monitoring of a preexistent Application
A student who has an existing application can discuss with Luca Montanari how to monitor it in order to discover critical patterns or interesting situations during the normal lifetime of the system.
In order to present the work, the application must also work off-line using log-files.
15
Vignola Jacopo
Jacopo.Vignola.1@city.ac.uk
Similarities between Power and Computational Grids
The implementation of smart meters and grids will offer network operators and energy suppliers the ability to observe in real time generation, transportation and consumption flows: this will be possible by overlaying a new data network onto the existing physical grid. The architecture of such data network will represent a key element for enhancing the implementation of smart technologies.
The student is expected to provide a detailed analysis of the similarities between the characteristics (and objectives) of power grids and those of computational grids (and/or other distributed network applications), demonstrating (optional) that a peer-to-peer distributional structure is suitable for exploiting smart technologies.
Ref 1: Chetty M. and Buyya R. (2002), “Weaving Computational grids: how analogous are they with electrical grids?”, IEEE
Ref 2: Irving, M.; Taylor, G.; Hobson, P.; (2004), "Plug in to grid computing," Power and Energy Magazine, IEEE , vol.2, no.2, pp. 40- 44
Ref 3: Massoud Amin, S.; Wollenberg, B.F. (2005) , "Toward a smart grid: power delivery for the 21st century" Power and Energy Magazine, IEEE , vol.3, no.5, pp. 34- 41
16
Vignola Jacopo
Jacopo.Vignola.1@city.ac.uk
Simulation tools for distributional network architectures
Simulation has been used extensively for modelling and evaluating real-world systems, from business processes and factory assembly lines to computer systems design. While there exists a large body of knowledge and tools, few projects have been developed specifically for simulating communications (e.g. SimJava, NS-2, Parsec, P2Psim, PlanetSim, PeerSim) or application scheduling (e.g. Bricks, MicroGrid, Simgrid, GridSim) in grid computing environments. In addition, other projects focus on agent-based simulation in different interaction designs (e.g. SWARM, RePast, JAS).
The student is expected to provide a comparative analysis of existing simulation tools in grid computing environments, suggesting (optional) the most suitable one for simulating agent-based economic activities (e.g. trading of a commodity) in a distributional network environment. This coursework may evolve into a dissertation project by using the identified tool to run the simulation of trading a homogeneous commodity in a P2P network environment.
Ref 1: Naicken, Stephen, Anirban Basu, Barnaby Livingston, and Sethalat Rodhetbhai. "A survey of peer-to-peer network simulators." In Proceedings of The Seventh Annual Postgraduate Symposium, Liverpool, UK, vol. 2. 2006.
Ref 2: Niazi, Muaz, and Amir Hussain. "Agent-based tools for modeling and simulation of self-organization in peer-to-peer, ad hoc, and other complex networks." Communications Magazine, IEEE 47, no. 3 (2009): 166-173.
Ref 3: Buyya, Rajkumar. "Economic-based distributed resource management and scheduling for grid computing." arXiv preprint cs/0204048, Monash University, Australia (2002)
17-18
Di Luna Giuseppe
diluna@dis.uniroma1.it
Understanding a real world scale attack: The Carna botnet.
The paper [1] "Port scanning /0 using insecure embedded devices" (http://internetcensus2012.bitbucket.org/paper.html) shows how easy it is to build a botnet of 420k devices: specifically, a cross-platform botnet that spreads by exploiting a very old, yet extremely effective, trick. This botnet has been used to run tests against the whole Internet address space.
These tests include ICMP ping, reverse DNS, SYN scan and traceroute. The gathered data and, in part, the source code have been released by the author of the botnet.
The released data amount to about 9 terabytes and cover the whole IPv4 address space (0.0.0.0/0)!
An analysis of this information that goes beyond the results obtained in [1] poses the challenge of handling an extremely large dataset. Moreover, the release of the source code gives anyone interested the remarkable opportunity to study a piece of software that has done what no other human or machine had done before: taking an (inconsistent) snapshot of the Internet.
Two assignments on this subject could be available:
(A) (Practical):
An interested student could analyze the source code of the Carna botnet in order to:
- Obtain a precise idea of the botnet internals, specifically the spreading and control mechanisms
- Update the botnet source code in order to increase its virulence
- Run a simulated botnet over a (not necessarily) simulated target
(B) (Practical):
An interested student could analyze the data gathered by the Carna botnet (don't worry, I have already downloaded the 9 terabytes) in order to:
- Assess whether the data are real by means of replicated measurements on sampled IPv4 addresses, for example using traceroute and reverse DNS
- Study the data to obtain aggregated information on our university. How many devices are accessible from the outside? Which kinds of services are used? When was our university scanned?
19
Di Luna Giuseppe
diluna@dis.uniroma1.it
The hidden society of hackers: Having fun with IRC, Tor and I2P
IRC (Internet Relay Chat) is a chat system that was mainly used between 1998 and 2005. The use of IRC among the general public has been decreasing since the advent of instant messaging, and now, thanks to Facebook, IRC is almost dead. Almost, because IRC is still widely used by a narrow and interesting sub-culture: that of hackers. Using hidden networks (Tor, I2P) to hide their identities, and sometimes the location of IRC servers, hackers still use IRC to coordinate their moves and to have "cheap talk" on public channels.
So it is possible for any of us (anyone who knows where to look) to monitor these conversations. At first glance, "cheap talk" may not seem so interesting, but to the best of my knowledge some of the major FBI targets have been busted thanks to leaks of personal information on IRC. Some examples are:
- Sabu of LulzSec was taken down because he once forgot to join IRC using Tor.
- sup9, the hacker behind the Stratfor hack, was arrested thanks to the personal info that he leaked on IRC.
Moreover, in my personal experience, I have seen a lot of crazy things on IRC: people ratting each other out by posting personal info on public channels, IP addresses of targets posted in clear view, information about 0-day exploits, links to various online identities, and leaks of home IP addresses or time zones.
An assignment on this subject could be available:
(A) (Practical):
An interested student could write an IRC bot (a piece of software that mimics a real user) in order to:
- collect public references to links that contain "hot content": passwords, dumps, CCV
- record a history of nick changes in order to link multiple identities
- collect leaks of personal info such as time zone, IP address, tracking information in pasted URLs and similar.