XSEDE 2016 Malaria Project
Submitted by Brian Bartoldson, Eitan Lees, Ian McCann, Amirhessam Tahmassebi and Alex Townsend
Reviewed by Wolverine

Here are some comments and suggestions on the project, not necessarily in order of relevance or appearance.

i) Writing comments

The "standard" way of organizing a paper's sections is:

1) Introduction: what the paper is going to be about and why we are interested in it.
2) Problem description: what is the model/problem, what are its properties, and what are the quantities involved. If the model is "small", consider writing the equations; otherwise, refer to some source (book, article or webpage, in order of importance) and maybe write only the 1-2 pieces that are most relevant to your work or that you will recall often later and/or in the results section.
3) Algorithm description: how are we going to solve the problem? Describe the algorithm, perhaps spending a couple of words on the 1-2 most important parts (such as bottlenecks, crucial parts, innovative strategies, etc.). Do not go into too many details, but do not assume the reader knows how to solve the problem.
4) Results: show one or more test cases. If it is a realistic test case, say why; otherwise, explain why a "manufactured" test is still useful. Be sure to motivate why the test you are making is important and worth doing (e.g., why are you computing sensitivities?).
5) Conclusions: recap (quickly!) what you have achieved, underlining what you added to previously existing work (if any) and, if possible, outlining future improvements to your work.

Parts 4 and 5 should be separated. It is important to have a dedicated section to "sum things up".

I would not introduce a new upgrade to the model in the results section (as you did in section 5). Instead, in section 2 (problem/model description) I would add a subsection dedicated to the "upgraded" model. You could start section 2 with the "basic" model and, in subsections, present variations of it.
But once you start with results, it is probably nicer not to go back to the structure of the model. If the "upgrade" to the model is suggested by poor results, in the model section you can anticipate that the basic model will not be good enough, referring the reader to the results section, where you will show it in detail.

Part 1 is almost always called "Introduction", because it anticipates what is in the paper (all of it). The title "Malaria" made me think you were only going to describe the disease, while instead you also talked about mathematical models. Also, malaria is not a person, so do not write "Malaria" (no uppercase). The structure of the intro was good, but it failed to say what YOU were going to do in the paper. You described the disease (good), you mentioned the fact that mathematical models would help (good), but you did not say that you were going to implement, test, and analyze one (or more) such models in this paper. Always say what you are going to do.

Part 2 (model description) is probably too short. You can spend a few more words on the model, perhaps adding a picture of the model diagram given in the pdf file. After presenting the assumptions (absence of latent infection period, immunization) and introducing the variables (what populations are there), you should briefly describe the relations among the variables (the "equations" that constitute the model). Note that you do not need to list all the equations, but at least try to summarize the spirit of them (e.g., balance). As a plus, you could mention what effect removing one or more of the assumptions would have on the model. For instance, what additional population would you have to include if you considered a non-zero latent period of infection? This helps the reader understand the role of your assumptions.

The Python implementation is an important part of your paper. It should probably have its own section, rather than being stacked with the description of the model.
In such a section, you may give some details about the code, including packages, type of language, what was already present, and what you added or modified. For instance, it was good to explain why you changed the code to enforce non-negative population sizes.

Miscellanea:

- Do not use vague and/or colloquial quantifying adjectives and adverbs, such as "basically" or "a little more". Instead, use more professional terms, such as "more specifically" or "greater".
- Avoid phrasal verbs; use more professional and specific one-word verbs (e.g., "make up" can be replaced with "constitute").
- Be more assertive about the reasons why you obtained some results. E.g., do not say "maybe X is because of Y"; instead, show that you believe it, as in "we believe X is caused by Y" or, even more categorical, "X is caused by Y". Solid explanations or strong feelings tend to convince the reader more than a mere "maybe". If you really are not sure, still try: "one possible explanation for X is Y".
- Do not refer to the reader directly (e.g., "as you can see", in sec. 4.1). Instead, use "as can be seen from figure X" or "as we can see from figure X".
- Avoid referring to figures as "the next figure" or "the plot below". Give a number to each figure and table and refer to them as "figure 3" or "table 1".
- Avoid expressions such as "I would say that X is the best choice". Instead, prefer expressions such as "a reasonable choice is X" or "the figure suggests X is a reasonable choice".
- Avoid commenting on figures in their captions. Instead, interpret the results in the section text, and limit the caption to describing what is in the figure.
- If you use captions for figures (and you should), use them on all of them. Be consistent.
- You present results on sensitivities before showing anything else. Perhaps it is clearer to show the model results first, and then study the sensitivity.
  It is hard to understand the sensitivity plots before having seen any model output at all.
- Section 5.2: "according to the seasonal increase in...". "According" suggests that what comes next is a model, an opinion, a theory or a law, not a fact. Perhaps a better wording would be "Since in reality there is a seasonal increase in...".

ii) Coding/algorithm comments

- Try to add as many comments as possible. If possible, use self-explanatory names for variables. If that is not possible or not convenient (perhaps the name would be too long), add a comment that explains what the variable is. E.g., in sensitivity.py, what is bro, and what is brs? Or in extra_part3.py, what are treatedBRFM and newBRFM?
- You said (section 3) that the algorithm is very similar to Forward Euler. It does not look similar to Forward Euler to me ("time_step" is never explicitly used in the update of the quantities). You need to explain in what sense it is "similar". Also, look at this sentence: "is an explicit method for solving models such as this in the sense that each can be directly solved using information from the previous timesteps and does not require solving a nonlinear system with dependence on the current time at each timestep". First, be careful. Explicit means not implicit (clearly), but implicit does not necessarily mean nonlinear. Nonlinearity is a feature of the problem, not of the algorithm (although explicit methods never require solving a nonlinear system). Explicit/implicit is a feature of the algorithm. Second, "models such as this" is quite vague. What kind of models are you referring to?
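
To make the explicit/implicit remark concrete, here is a minimal sketch. It is not the project's code: the test equation dy/dt = -rate * y and the names forward_euler, backward_euler, rate and y0 are assumptions chosen only for illustration. It shows (a) how "time_step" enters a Forward Euler update explicitly, and (b) that an implicit method applied to a linear problem needs only a linear solve (here a division), with no nonlinear system involved.

```python
import math


def forward_euler(rate, y0, time_step, n_steps):
    """Explicit update: y_new = y_old + time_step * f(y_old), with f(y) = -rate * y."""
    y = y0
    for _ in range(n_steps):
        # The right-hand side is evaluated at the known, previous value.
        y = y + time_step * (-rate * y)
    return y


def backward_euler(rate, y0, time_step, n_steps):
    """Implicit update: y_new = y_old + time_step * f(y_new), with f(y) = -rate * y."""
    y = y0
    for _ in range(n_steps):
        # The unknown y_new appears on both sides, but f is linear, so the
        # equation (1 + time_step * rate) * y_new = y_old is solved by a division.
        y = y / (1.0 + time_step * rate)
    return y


# Hypothetical parameter values, chosen only to exercise the two functions.
rate, y0, time_step, n_steps = 0.5, 100.0, 0.01, 200
exact = y0 * math.exp(-rate * time_step * n_steps)
explicit_result = forward_euler(rate, y0, time_step, n_steps)
implicit_result = backward_euler(rate, y0, time_step, n_steps)

print("exact:         ", exact)
print("forward Euler: ", explicit_result)
print("backward Euler:", implicit_result)
```

For this decaying test problem both methods approach the exact solution as time_step shrinks. An update rule in which time_step never multiplies the rate of change, as in the submitted code, would need a separate argument to be called "similar" to Forward Euler.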