Computer Science · Grade 12 · 20 min

Paper Writing


Tutorial Preview

1. Introduction & Learning Objectives

Learning Objectives

- Formulate a testable hypothesis for a computer science problem.
- Design a controlled experiment to evaluate the performance of an algorithm or system.
- Identify independent, dependent, and confounding variables in a CS experiment.
- Collect and organize empirical data from software execution.
- Perform a basic statistical analysis to interpret experimental results.
- Structure the 'Methods' section of a formal research paper.
- Differentiate between quantitative and qualitative research methods in a CS context.

How do we scientifically prove that a new compression algorithm is actually better, not just 'faster' on one specific machine? 💻 Let's find out! This tutorial will guide you through the formal research methods used in computer sc...
2. Key Concepts & Vocabulary

Hypothesis
  Definition: A precise, testable statement about the expected outcome of an experiment. It proposes a relationship between variables.
  Example: 'A non-recursive implementation of Merge Sort will have a lower memory footprint than a recursive implementation for datasets exceeding 1 million elements.'

Independent Variable
  Definition: The variable that the researcher changes or controls to observe its effect on another variable.
  Example: In an experiment comparing sorting algorithms, the independent variable is the 'algorithm used' (e.g., Quicksort, Merge Sort, Timsort).

Dependent Variable
  Definition: The variable that is being measured or tested in an experiment. Its value depends on the independent variable.
  Example: When comparing sorting algorithms, common dependent variables are 'executi...
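To make the vocabulary concrete, here is a minimal sketch (the field names and the example hypothesis are hypothetical, not part of any required format) of how these terms map onto a plan for a sorting-algorithm experiment:

```python
# Hypothetical experiment plan labeling each variable from the table above.
experiment = {
    "hypothesis": "Merge Sort runs faster than Insertion Sort on large random inputs",
    "independent_variable": "algorithm used",      # what we deliberately change
    "dependent_variable": "execution time (ms)",   # what we measure
    "controlled_variables": [                      # held constant to avoid confounds
        "input data",
        "hardware",
        "background system load",
        "language and runtime version",
    ],
}

for key, value in experiment.items():
    print(f"{key}: {value}")
```

Writing the plan down like this before coding anything forces you to name every variable explicitly, which makes confounding variables easier to spot.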
3. Core Syntax & Patterns

Controlled Experimental Design Pattern

1. Formulate Hypothesis -> 2. Identify Variables (IV, DV) -> 3. Establish a Control/Baseline -> 4. Run Trials & Collect Data -> 5. Analyze Results

This is the fundamental pattern for any performance-based CS research. It ensures that you are isolating the impact of your change (the IV) on a measurable outcome (the DV) by comparing it against a consistent baseline.

Algorithmic Benchmarking Protocol

For each algorithm: Run N times on the same input -> Discard outliers (e.g., first 'warm-up' run) -> Calculate mean and standard deviation of the performance metric.

Use this protocol to get reliable and statistically meaningful performance data. A single run is never sufficient due to system noise. Averaging ov...
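The benchmarking protocol above can be sketched in a few lines of Python. This is a minimal harness, not a production benchmarking tool: the function name `benchmark` and the choice of built-in `sorted` and `list.sort` as the two 'algorithms' under test are illustrative assumptions.

```python
import statistics
import time

def benchmark(func, data, trials=10, warmup=1):
    """Run func on a fresh copy of data (trials + warmup) times.

    Discards the first `warmup` runs, then returns (mean_ms, stdev_ms)
    of the remaining execution times.
    """
    times_ms = []
    for _ in range(trials + warmup):
        sample = list(data)                      # identical input every trial
        start = time.perf_counter()
        func(sample)
        times_ms.append((time.perf_counter() - start) * 1000)
    measured = times_ms[warmup:]                 # drop warm-up run(s)
    return statistics.mean(measured), statistics.stdev(measured)

# IV: the algorithm used; DV: execution time in ms; everything else held constant.
data = list(range(5000, 0, -1))                  # same reversed input for both
mean_a, sd_a = benchmark(sorted, data)           # returns a new sorted list
mean_b, sd_b = benchmark(lambda xs: xs.sort(), data)  # sorts in place

print(f"sorted():    mean {mean_a:.3f} ms, stdev {sd_a:.3f} ms")
print(f"list.sort(): mean {mean_b:.3f} ms, stdev {sd_b:.3f} ms")
```

Reporting the standard deviation alongside the mean is what lets a reader judge consistency, which matters in the A* vs. Dijkstra's question later in this tutorial.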


Sample Practice Questions

Challenging
You suspect that a new database indexing strategy's performance is being influenced by a confounding variable: the server's disk I/O from a nightly backup process. How would you design a controlled experiment to isolate the effect of the indexing strategy and prove your suspicion about the backup process?
A. Run the tests only during the day and assume the backup process is the cause of any difference seen at night.
B. Increase the sample size of the test runs to over 1000 to average out the noise from the backup process.
C. On an isolated test machine, run the experiment with the new index both with and without a simulated backup process running.
D. Run the tests with and without the new index, but only during the nightly backup window, to measure the combined effect.
Challenging
A research paper compares two search algorithms. The 'Methods' section states: 'We ran Algorithm A on a modern multi-core Intel i9 processor and Algorithm B on an older single-core AMD processor. We ran each test 50 times and found Algorithm A was faster.' Why does this methodology fundamentally invalidate the paper's conclusion?
A. The sample size of 50 is insufficient for a valid conclusion.
B. The authors did not state a clear hypothesis before conducting the experiment.
C. The authors failed to discard the warm-up runs for each algorithm.
D. The experiment violates the 'Ceteris Paribus' principle by using different hardware, a major confounding variable.
Challenging
You are comparing A* and Dijkstra's pathfinding algorithms on a set of 100 identical graphs. For each graph, you run each algorithm 10 times, discard the first run, and average the next 9. Your results show A* has a mean time of 120ms (std dev 5ms) and Dijkstra's has a mean time of 150ms (std dev 25ms). Which is the most robust and accurate conclusion for your paper?
A. On this specific graph type, A* is not only faster on average but also exhibits significantly more consistent performance than Dijkstra's.
B. A* is always 30ms faster than Dijkstra's for any pathfinding problem.
C. Dijkstra's is an unreliable algorithm due to its high standard deviation and should not be used in production systems.
D. The experiment is flawed because both algorithms should have the same standard deviation on identical graphs.

