<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Analyzing and Merging of Similar Test Cases in Forked Software Systems</title>
<link rel="stylesheet" href="../dist/reveal.css">
<link rel="stylesheet" href="../dist/theme/ovgu.css" id="theme">
<link rel="stylesheet" href="../plugin/highlight/so-light.css">
<style type="text/css" media="screen">
.slides section.has-dark-background,
.slides section.has-dark-background h3 {
color: #fff;
}
.slides section.has-light-background,
.slides section.has-light-background h3 {
color: #222;
}
</style>
<script defer src="../node_modules/@fortawesome/fontawesome-free/js/fontawesome.js"></script>
<script defer src="../node_modules/@fortawesome/fontawesome-free/js/solid.js"></script>
</head>
<body>
<div class="reveal">
<div class="slides">
<section class="headslide">
<div class="header">
<img style="margin-bottom: 100px!important" src="../resources/ovgu.svg"/>
<div class="bar"></div>
</div>
<div class="center">
<h1>Analyzing and Merging of Similar Test Cases in Forked Software Systems</h1>
<p>Johannes Wünsche, 28.1.2021</p>
</div>
</section>
<section data-transition="fade">
<section class="headslide">
<div class="header">
<img src="../resources/ovgu.svg"/>
<div class="bar"></div>
</div>
<div class="center">
<h1>1. Motivation</h1>
</div>
<aside class="notes">
- First, motivate and explain the intention behind our proposed technique<br>
</aside>
</section>
<section style="height: 935px;">
<div class="header">
<img src="../resources/ovgu.svg"/>
<div class="bar"></div>
</div>
<div class="center">
<img style="height: 500px;" src="../resources/many-authors.png"/>
<p>Fig. 1: Snapshot of Rust Contributor Statistics of the last month</p>
</div>
<aside class="notes">
+ Example: Rust contributors, GitHub statistics from mid-December to mid-January <br />
+ Modern software systems have grown to include large quantities of code, and higher-quality code is generally more thoroughly tested<br>
+ New developers -> less familiar with the code base -> orient themselves on already implemented behavior <br />
+ Tools like linters keep quality high <br />
+ Linting is a frequently performed operation which tries to detect undesired patterns in code -> these can easily be introduced by new developers<br />
</aside>
</section>
</section>
<section data-transition="fade">
<section class="headslide">
<div class="header">
<img src="../resources/ovgu.svg"/>
<div class="bar"></div>
</div>
<h1>2. Related Work</h1>
</section>
<section style="height: 935px;">
<div class="header">
<img src="../resources/ovgu.svg"/>
<div class="bar"></div>
</div>
<h3>Technologies</h3>
<div style="display: flex; flex-wrap: wrap; justify-content: space-evenly; align-content: center; height: 700px;">
<div class="center fragment fade-up"><i class="fas fa-clone fa-3x"></i><br>Clone Detection</div>
<div class="center fragment fade-up"><i class="fas fa-search fa-3x"></i><br>Test Case Analysis</div>
<div class="center fragment fade-up"><i class="fas fa-code fa-3x"></i><br>Static Code Analysis</div>
</div>
<p>
<small>
Icons used on this slide are CC-BY 4.0 <a href="https://fontawesome.com/">fontawesome</a>
</small>
</p>
<aside class="notes">
+ Clone Detection: finding common code structures <br />
&emsp;- works aim to reduce and indicate clones to assist developers, not restricted to certain kinds of source code <br />
&emsp;- clones are generally differentiated into types 1, 2, 3, and 4 <br />
&emsp;- examples are tools like CloneDR, NiCAD, CCFinder <br />
+ Test Case Analysis: understanding and optimization of test suites <br />
&emsp;- many works focus on selection and prioritization <br />
&emsp;- aims to improve performance and responsiveness for developers <br />
&emsp;- the most used means to understand tests are runtime effects (call stack tracing &amp; profiling) and coverage <br />
&emsp;- different metrics are then applied to find similarity values <br />
+ Static Code Analysis: the recognition of patterns in code <br />
&emsp;- analysis of code without compilation or execution <br />
&emsp;- static and predictable execution time, does not require building the software -> system independent <br />
</aside>
</section>
</section>
<section data-transition="fade">
<section class="headslide">
<div class="header">
<img src="../resources/ovgu.svg"/>
<div class="bar"></div>
</div>
<h1>3. AST-based Test Similarity</h1>
<aside class="notes">
+ Now to our proposed technique <br />
+ Here we give an overview of the complete process and the core concepts applied <br />
</aside>
</section>
<section style="height: 935px;">
<div class="header">
<img src="../resources/ovgu.svg"/>
<div class="bar"></div>
</div>
<div class="center">
<img style="height: 650px;" src="../resources/overview-1.svg"/>
<p>Fig. 2: Filtering, Treatment, and Matching of ASTs</p>
</div>
<aside class="notes">
+ pipeline-like approach operating on sets of data: <br />
&emsp;- selection of source code and test methods <br />
&emsp;- generalization is a transformation of trees to find higher-type clones: type-1 clones, type-2 (we split this into two groups)
<br />
&emsp;- on each of them we find different characteristics which allow us to perform different actions <br />
+ Comparison is based on hashed lists of nodes; we try to find nodes which include these lists: <br />
&emsp;- results in a list of node pairs which are equal on the left and right side <br />
</aside>
</section>
<section data-auto-animate style="height: 935px;">
<div class="header">
<img src="../resources/ovgu.svg"/>
<div data-id="a" class="bar"></div>
</div>
<div style="display: grid; grid-template-rows: auto auto; grid-template-columns: auto; justify-items: center;">
<img src="../resources/base.svg">
<pre data-id="code-animation"><code data-trim data-noescape class="c"><script type="text/template">
// test.c
#include <stdio.h>
void helloworld() {
char* msg = "Hello World";
char* line_end = "\n";
printf("%s%s", msg, line_end);
}
int main() {
helloworld();
return 0;
}</script></code></pre>
</div>
</section>
<section data-auto-animate style="height: 935px;">
<div class="header">
<img src="../resources/ovgu.svg"/>
<div data-id="b" class="bar"></div>
</div>
<div style="display: grid; grid-template-rows: auto auto; grid-template-columns: auto; justify-items: center;">
<img src="../resources/fork.svg">
<pre data-id="code-animation" style="height: 570px;"><code data-trim data-noescape class="c"><script type="text/template">
// test.c
#include <stdio.h>
void helloworld() {
char* msg = "Hello World";
char* line_end = "\n";
printf("%s%s", msg, line_end);
}
void printmore() {
// Include other worlds too
char* msg = "And to all the other worlds!";
char* line_end = "\n";
printf("%s%s", msg, line_end);
}
int main() {
helloworld();
printmore();
return 0;
}</script></code></pre>
</div>
</section>
<section style="height: 935px;">
<div class="header">
<img src="../resources/ovgu.svg"/>
<div data-id="b" class="bar"></div>
</div>
<div style="display: grid; grid-template-rows: auto auto; grid-template-columns: auto; justify-items: center;">
<img style="height: 150px;" src="../resources/step-1.svg">
<pre style="height: 540px;"><code data-trim data-noescape class="c"><script type="text/template">
diff --git a/test.c b/test.c
index 4a1f225..79be69b 100644
--- a/test.c
+++ b/test.c
@@ -6,7 +6,14 @@ void helloworld() {
printf("%s%s", msg, line_end);
}
+void printmore() {
+ // Include other worlds too
+ char* msg = "And to all the other worlds!";
+ char* line_end = "\n";
+ printf("%s%s", msg, line_end);
+}
+
int main() {
helloworld();
+ printmore();
return 0;
}</script></code></pre>
</div>
</section>
<section style="height: 935px;">
<div class="header">
<img src="../resources/ovgu.svg"/>
<div data-id="b" class="bar"></div>
</div>
<div style="display: grid; grid-template-rows: auto auto; grid-template-columns: auto; justify-items: center;">
<img style="height: 150px;" src="../resources/step-3.svg">
<img src="../resources/code.svg">
</div>
</section>
<section style="height: 935px;">
<div class="header">
<img src="../resources/ovgu.svg"/>
<div data-id="b" class="bar"></div>
</div>
<div style="display: grid; grid-template-rows: auto auto; grid-template-columns: auto; justify-items: center;">
<img style="height: 150px;" src="../resources/step-4.svg">
<img src="../resources/general.svg">
</div>
</section>
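<section data-markdown style="height: 935px;">
<textarea data-template>
<div class="header">
<img src="../resources/ovgu.svg"/>
<div class="bar"></div>
</div>

A minimal sketch of the generalization and hashing step, not the prototype itself: the prototype is written in Kotlin for Java, while this sketch uses Python's `ast` module as a stand-in. The names `Generalize`, `subtree_hash`, and `statement_hashes`, the placeholder rules, and the use of SHA-256 are assumptions for illustration.

```python
import ast
import hashlib


class Generalize(ast.NodeTransformer):
    """Replace identifiers and literals with placeholders (assumed rule set)."""

    def visit_Name(self, node):
        return ast.copy_location(ast.Name(id="_", ctx=node.ctx), node)

    def visit_Constant(self, node):
        return ast.copy_location(ast.Constant(value=None), node)


def subtree_hash(node):
    """Hash the structure of a subtree after generalization."""
    generalized = Generalize().visit(node)
    return hashlib.sha256(ast.dump(generalized).encode()).hexdigest()


def statement_hashes(source):
    """Return one hash per statement of the first function in `source`."""
    func = ast.parse(source).body[0]
    return [subtree_hash(stmt) for stmt in func.body]


left = 'def a():\n    msg = "Hello World"\n    print(msg)\n'
right = 'def b():\n    text = "And to all the other worlds!"\n    print(text)\n'
matches = [(i, j) for i, h in enumerate(statement_hashes(left))
           for j, g in enumerate(statement_hashes(right)) if h == g]
```

Here `matches` is `[(0, 0), (1, 1)]`: both statements pair up although names and literals differ, which is the type-2 case.
</textarea>
</section>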
<section style="height: 935px;">
<div class="header">
<img src="../resources/ovgu.svg"/>
<div class="bar"></div>
</div>
<div class="center r-stack">
<img class="fragment fade-out" data-fragment-index="0" style="height: 650px;" src="../resources/overview-2-1.svg"/>
<div class="center fragment fade-in-then-out" data-fragment-index="0" style="background: white; padding: 50px; border: 5px solid black; border-radius: 20px;">
<img src="../resources/sequence.svg">
<p>Fig. 4: Example Sequence</p>
</div>
<img class="fragment fade-in-then-out" style="height: 650px;" src="../resources/overview-2-2.svg"/>
<img class="fragment fade-in-then-out" style="height: 650px;" src="../resources/overview-2-3.svg"/>
</div>
<div class="center">
<p>Fig. 3: Pipeline on Found Matches</p>
</div>
<aside class="notes">
+ Filter the results to detect characteristics: <br />
&emsp;- Statements, to find sequences like loops, conditional statements, blocks and expressions <br />
&emsp;- Blocks, specifically, allow for replacement and work on partial type-2 clones <br />
&emsp;- All is a catch-all for type-2 clones to find similarities between methods with a general measure of confidence <br />
+ Sequences are then analyzed (shown on the next slide) and maximized <br />
+ Blocks: search for equal methods, which is equivalent to finding equal test cases <br />
+ All: the measure of similarity is the largest overlap, defined by the number of nodes in common; if this represents more than 95% of the nodes in the method, they are considered approximately equal <br />
+ At the end, a report is generated in a machine-readable format (JSON), containing warnings and performed modifications as well as their locations <br />
</aside>
</section>
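<section data-markdown style="height: 935px;">
<textarea data-template>
<div class="header">
<img src="../resources/ovgu.svg"/>
<div class="bar"></div>
</div>

A minimal sketch of the 95% overlap check used for the "All" filter. It assumes each method is given as a list of node hashes and that the ratio is taken against the larger method; the names `overlap_ratio` and `approximately_equal` are hypothetical.

```python
from collections import Counter

THRESHOLD = 0.95  # more than 95% of the nodes must be shared


def overlap_ratio(left_nodes, right_nodes):
    """Share of the larger method's nodes that also occur in the other one."""
    left, right = Counter(left_nodes), Counter(right_nodes)
    common = sum(min(count, right[key]) for key, count in left.items())
    # Assumption: normalize by the larger method; guard against empty input.
    return common / max(len(left_nodes), len(right_nodes), 1)


def approximately_equal(left_nodes, right_nodes):
    """True if the shared nodes exceed the threshold."""
    return overlap_ratio(left_nodes, right_nodes) > THRESHOLD


method_a = ["h%d" % i for i in range(40)]
method_b = method_a[:39] + ["other"]  # one of 40 nodes differs
method_c = method_a[:30]              # ten nodes are missing
```

With these inputs, `method_a` and `method_b` share 39 of 40 nodes and count as approximately equal, while `method_a` and `method_c` share only 30 of 40 and do not.
</textarea>
</section>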
<section style="height: 935px;">
<div class="header">
<img src="../resources/ovgu.svg"/>
<div class="bar"></div>
</div>
<div class="center">
<img style="height: 650px;" src="../resources/overview.svg"/>
<p>Fig. 5: Process Overview showing the operational structure</p>
</div>
<aside class="notes">
+ shows an overview of the entire process <br />
+ this may be used in the workflow when run in CI<br />
+ for example in bots for automated reports on mailing lists or in project organization <br />
</aside>
</section>
</section>
<section data-transition="fade">
<section class="headslide">
<div class="header">
<img src="../resources/ovgu.svg"/>
<div class="bar"></div>
</div>
<h1>4. Evaluation</h1>
<aside class="notes">
+ Evaluation of the prototype <br />
+ implemented in Kotlin for Java source code <br />
</aside>
</section>
<section>
<div class="header">
<img src="../resources/ovgu.svg"/>
<div class="bar"></div>
</div>
<h3>Research Questions &amp; Goals</h3>
<div class="center" style="align-items: start; height: 650px;">
<p>
1. Are results found with this technique valid?
</p>
<p>
2. Can they be used to reduce the duplication of test code?
</p>
<p>
3. Is extraction of common sequences a practical goal in the context of PRs?
</p>
<p>
4. Is the Java testing environment a good fit for the technique?
</p>
</div>
</section>
<section style="height: 935px;">
<div class="header">
<img src="../resources/ovgu.svg"/>
<div class="bar"></div>
</div>
<div class="center" style="margin-top: 150px;">
<div style="display: grid; grid-template-rows: repeat(8, auto); grid-template-columns: repeat(3, auto); gap: 15px 50px;">
<div style="grid-row: 1/1; grid-column: 1/4; height: 5px; background: black;"></div>
<div>Repository</div>
<div>No. of Test Cases</div>
<div>SLOC</div>
<div style="grid-row: 3/3; grid-column: 1/4; height: 3px; background: black;"></div>
<div>jackson-databind</div>
<div style="justify-self: end">2667</div>
<div style="justify-self: end">198650</div>
<div>mockito</div>
<div style="justify-self: end">2018</div>
<div style="justify-self: end">8851</div>
<div>junit5</div>
<div style="justify-self: end">3653</div>
<div style="justify-self: end">134727</div>
<div>javaparser</div>
<div style="justify-self: end">2371</div>
<div style="justify-self: end">272627</div>
<div>guava</div>
<div style="justify-self: end">1071</div>
<div style="justify-self: end">756059</div>
<div style="grid-row: 9/9; grid-column: 1/4; height: 5px; background: black;"></div>
</div>
<p> Table 1: Overview of Selected Projects</p>
<aside class="notes">
+ Short overview of selected projects <br />
+ differentiating in size but all rather large repositories with more or less extensive testing <br />
</aside>
</div>
</section>
<section style="height: 935px;">
<div class="header">
<img src="../resources/ovgu.svg"/>
<div class="bar"></div>
</div>
<div class="center">
<pre style="height: 690px; font-size: 0.5em;"><code data-trim data-noescape data-line-numbers="1-27|4,18|7,21|12,26" class="java"><script type="text/template">
@Test
void readsLineFromDefaultMaxCharsFileWithDefaultConfig(@TempDir Path tempDir) throws Exception {
Path csvFile = writeClasspathResourceToFile(
"/default-max-chars.csv", tempDir.resolve("default-max-chars.csv"));
CsvFileSource annotation = csvFileSource()
.encoding("ISO-8859-1")
.resources("/default-max-chars.csv")
.files(csvFile.toAbsolutePath().toString())
.build();
Stream<Object[]> arguments = provideArguments(
new CsvFileArgumentsProvider(), annotation);
assertThat(arguments).hasSize(2);
}
@Test
void readsFromClasspathResourcesAndFiles(@TempDir Path tempDir) throws Exception {
Path csvFile = writeClasspathResourceToFile(
"/single-column.csv", tempDir.resolve("single-column.csv"));
CsvFileSource annotation = csvFileSource()
.encoding("ISO-8859-1")
.resources("/single-column.csv")
.files(csvFile.toAbsolutePath().toString())
.build();
Stream<Object[]> arguments = provideArguments(
new CsvFileArgumentsProvider(), annotation);
assertThat(arguments).hasSize(2 * 5);
}</script></code></pre>
<p>Fig. 6: Example of found Test Case Duplication</p>
</div>
</section>
<section style="height: 935px;">
<div class="header">
<img src="../resources/ovgu.svg"/>
<div class="bar"></div>
</div>
<div class="center" style="margin-top: 150px;">
<div style="display: grid; grid-template-rows: repeat(8, auto); grid-template-columns: repeat(4, auto); gap: 15px 50px;">
<div style="grid-row: 1/1; grid-column: 1/5; height: 5px; background: black;"></div>
<div>Repository</div>
<div>Equal Test Cases</div>
<div>Similar Sequences</div>
<div>High Test Similarity</div>
<div style="grid-row: 3/3; grid-column: 1/5; height: 3px; background: black;"></div>
<div>jackson-databind</div>
<div style="justify-self: end">0</div>
<div style="justify-self: end">11</div>
<div style="justify-self: end">559</div>
<div>mockito</div>
<div style="justify-self: end">0</div>
<div style="justify-self: end">30</div>
<div style="justify-self: end">55</div>
<div>junit5</div>
<div style="justify-self: end">0</div>
<div style="justify-self: end">10</div>
<div style="justify-self: end">2412</div>
<div>javaparser</div>
<div style="justify-self: end">0</div>
<div style="justify-self: end">69</div>
<div style="justify-self: end">9</div>
<div>guava</div>
<div style="justify-self: end">0</div>
<div style="justify-self: end">18</div>
<div style="justify-self: end">154</div>
<div style="grid-row: 9/9; grid-column: 1/5; height: 5px; background: black;"></div>
</div>
<p> Table 2: Numerical Representation of Results</p>
</div>
<aside class="notes">
+ Complete Overview of occurrences in the evaluation <br />
+ First, notice that there was no merging -> this is due to the strict requirements <br />
&emsp;- these have been set to prevent the disruption or annihilation of test cases <br />
&emsp;- possible candidates are likely to be excluded in the review process<br />
+ High number of similar methods in some pull requests, e.g. junit5 <br />
&emsp;- We show an example of where this was found later, and explain how these large numbers came about in junit5 <br />
+ The number of sequences is at the expected level <br />
+ Next, we look at the average length of these occurrences <br />
</aside>
</section>
<section style="height: 935px;">
<div class="header">
<img src="../resources/ovgu.svg"/>
<div class="bar"></div>
</div>
<div class="center" style="margin-top: 150px;">
<div style="display: grid; grid-template-rows: repeat(8, auto); grid-template-columns: repeat(3, auto); gap: 15px 50px;">
<div style="grid-row: 1/1; grid-column: 1/4; height: 5px; background: black;"></div>
<div>Repository</div>
<div>Similar Sequences</div>
<div>High Test Similarity</div>
<div style="grid-row: 3/3; grid-column: 1/4; height: 3px; background: black;"></div>
<div>jackson-databind</div>
<div style="justify-self: end">3.0</div>
<div style="justify-self: end">4.43</div>
<div>mockito</div>
<div style="justify-self: end">3.0</div>
<div style="justify-self: end">11.71</div>
<div>junit5</div>
<div style="justify-self: end">3.39</div>
<div style="justify-self: end">3.11</div>
<div>javaparser</div>
<div style="justify-self: end">3.0</div>
<div style="justify-self: end">11.34</div>
<div>guava</div>
<div style="justify-self: end">3.0</div>
<div style="justify-self: end">3.06</div>
<div style="grid-row: 9/9; grid-column: 1/4; height: 5px; background: black;"></div>
</div>
<p> Table 3: Average Length (in Lines of Code) of found Duplication</p>
</div>
<aside class="notes">
+ Merging is omitted as there were no matches there <br />
+ sequences are rather short <br />
+ found were builders and short patterns within methods, as in the similar methods <br />
+ methods are mixed: some also short, some longer <br />
+ on average, longer matches were more applicable <br />
+ many matches in short methods contained standard methods like getters and setters of objects defined in test code, e.g. in jackson-databind and guava <br />
</aside>
</section>
</section>
<section data-transition="fade">
<section class="headslide">
<div class="header">
<img src="../resources/ovgu.svg"/>
<div class="bar"></div>
</div>
<div class="center">
<h1>5. Conclusion</h1>
</div>
</section>
<section>
<div class="header">
<img src="../resources/ovgu.svg"/>
<div class="bar"></div>
</div>
<h3>Lessons Learned</h3>
<div class="center" style="align-items: start; height: 650px;">
<p class="fragment">
- ASTs well suited for Test Case Analysis
</p>
<p class="fragment">
- Extraction is relatively complex and requires further analysis
</p>
<p class="fragment">
- Initial tests deliver good results but need more refinement
</p>
<p class="fragment">
- Viability of the technique
</p>
</div>
<aside class="notes">
+ Viable technique, as the combination of both topics mixes well <br />
+ First results contain valuable matches and insights <br />
+ Static analysis is a complex topic and, as expected, the effort to implement complex and edge-case-prone constructs is considerable <br />
+ Edge cases in extraction require in-depth knowledge <br />
</aside>
</section>
<section style="height: 935px;">
<div class="header">
<img src="../resources/ovgu.svg"/>
<div class="bar"></div>
</div>
<h3>Open Topics</h3>
<div class="center" style="height: 700px;">
<div style="display: grid; grid-template-rows: repeat(2, auto); grid-template-columns: repeat(3, 400px); justify-items: center; gap: 30px 60px;">
<div class="fragment fade-up" data-fragment-index=1>
<i class="fas fa-external-link-alt fa-3x"></i>
</div>
<div class="fragment fade-up" data-fragment-index=2>
<i class="fas fa-fingerprint fa-3x"></i>
</div>
<div class="fragment fade-up" data-fragment-index=3>
<i class="fas fa-language fa-3x"></i>
</div>
<div class="fragment fade-up" style="text-align: center;" data-fragment-index=1>Sequence Extraction</div>
<div class="fragment fade-up" style="text-align: center;" data-fragment-index=2>Context Identification</div>
<div class="fragment fade-up" style="text-align: center;" data-fragment-index=3>Validation in other Languages</div>
</div>
</div>
<p>
<small>
Icons used on this slide are CC-BY 4.0 <a href="https://fontawesome.com/">fontawesome</a>
</small>
</p>
<aside class="notes">
+ a few open topics <br />
+ first, automatic sequence extraction: for now we only detect sequences; extraction is possible but requires additional information from type resolution <br />
+ second, method context identification: recognition of test objects; this can be done in multiple ways, e.g. restriction to just test cases or detection of classes implementing a test environment <br />
+ test integration can be done as part of the identification: make sure which methods are actual test cases and which are not; the use of OOP in testing complicates this <br />
+ third, validating the technique in other language environments
</aside>
</section>
</section>
<section data-transition="fade">
<section class="headslide">
<div class="header">
<img src="../resources/ovgu.svg"/>
<div class="bar"></div>
</div>
<div class="center">
<h1>Thanks &amp; Q&amp;A</h1>
</div>
</section>
<section>
</section>
</section>
</div>
</div>
<script src="../dist/reveal.js"></script>
<script src="../plugin/zoom/zoom.js"></script>
<script src="../plugin/notes/notes.js"></script>
<script src="../plugin/search/search.js"></script>
<script src="../plugin/markdown/markdown.js"></script>
<script src="../plugin/highlight/highlight.js"></script>
<script>
Reveal.initialize({
center: true,
history: true,
width: 1600,
height: 900,
transition: 'none',
// transitionSpeed: 'slow',
// backgroundTransition: 'slide'
plugins: [ RevealZoom, RevealNotes, RevealSearch, RevealMarkdown, RevealHighlight ],
});
</script>
</body>
</html>