Statistical Analysis Syntax: Documenting Every Step You Take and Why
When I first learned how to program statistical analyses, we all had to use a central system called the VAX. I know . . . I KNOW . . . I am an ancient species. However, as I worked my way through my graduate program, these statistical analysis programs became available to run on our personal computers, and they also became easier and easier to program. The new programs launched with point-and-click options; users were no longer required to laboriously write syntax and hunt for errors, like missing periods in SPSS, and the need to master syntax decreased. Now it's so easy: if you want to see what an analysis of variance would look like, you can point, click, and run some data in about 10 seconds, even when the analysis is not logical. Your amazing computer will still spit out whatever you have asked it to compute.
Some of the lessons I learned came through making huge mistakes that cost me time, effort, money, and a lot of tears. The number one lesson may be that even though point and click certainly makes life easier, it can create other problems. One of the challenges of easy programming originates from solving problems on the run. For example, you may realize halfway through your analyses that you need to subset the data for those participants who completed surveys A, B, and C, but not D. You quickly filter out the data, point and click, run the analyses, and keep no record of HOW you subset your data for A, B, and C, but not D. It may make a difference in the number of people you keep in your study if you first select out D and then include A, B, and C. Unless you maintain an excellent record of decisions like this in your syntax, the natural "seat-of-the-pants" point-and-click options can create massive headaches when you are trying to write up your data analysis procedures, or even worse, when you are trying to publish your study.
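The subsetting scenario above can be sketched in a few lines of Python. The participant records and survey names here are hypothetical; the point is that each filtering decision, and the sample size it leaves behind, is recorded in the script itself rather than lost to a point-and-click session:

```python
# Hypothetical participant records: which surveys each person completed.
participants = [
    {"id": 1, "surveys": {"A", "B", "C"}},
    {"id": 2, "surveys": {"A", "B", "C", "D"}},
    {"id": 3, "surveys": {"A", "B"}},
    {"id": 4, "surveys": {"A", "B", "C"}},
]

# Decision 1 (documented): keep only those who completed A, B, and C.
step1 = [p for p in participants if {"A", "B", "C"} <= p["surveys"]]
print(f"After requiring A, B, C: n = {len(step1)}")

# Decision 2 (documented): of those, drop anyone who also completed D.
step2 = [p for p in step1 if "D" not in p["surveys"]]
print(f"After excluding D completers: n = {len(step2)}")
```

Printing the retained n after every decision leaves an audit trail you can paste straight into a methods section.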
Even if you keep a record of your data management decisions by documenting your syntax, that alone does not solve every documentation problem. Spaghetti anyone?!? In the past, chefs supposedly verified the quality of cooked spaghetti by throwing it against a wall; if it stuck, the spaghetti was ready to consume (or is that a myth? who prefers sticky spaghetti?). I hear you asking: what in the $#^@ does spaghetti have to do with syntax? Both words begin with an S, we know . . . but spaghetti programming in syntax is to be avoided at all costs. It is a sticky maze of unorganized logic.
Avoiding spaghetti programming is the reason experienced researchers generate and maintain a well-documented flow of syntax organized in a time-ordered and logical manner. For example, you would not calculate your summary variables after you have run your regression analyses using those summary variables. Therefore, place all of your preliminary data management tasks like recoding variables and calculating summary scores at the beginning of the syntax file.
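A minimal sketch of that time-ordered structure, using hypothetical item names, might look like this: data management first, derived scores next, analyses last, so nothing is used before it exists:

```python
# Hypothetical raw responses to a three-item scale.
responses = [
    {"item1": 4, "item2": 5, "item3": 3},
    {"item1": 2, "item2": 1, "item3": 2},
]

# Step 1: data management (recoding, cleaning) would go here.

# Step 2: compute the summary score BEFORE any analysis that uses it.
for r in responses:
    r["total"] = r["item1"] + r["item2"] + r["item3"]

# Step 3: analyses that depend on "total" come only after Step 2.
mean_total = sum(r["total"] for r in responses) / len(responses)
print(mean_total)
```

If the steps were reversed, Step 3 would fail (or, worse, silently use a stale variable), which is exactly the logical error the time-ordered file prevents.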
Comment, comment, comment on every chunk of syntax, so that you can read what you did 5 years later and understand the logic behind why you made specific choices. During the excitement of data entry and analysis, it is easy to make decisions you are confident you will recall in the future. Perhaps your memory is a steel vault! Most of us, however, end up hunting for notes or trying to remember the rationale or steps completed in the past. Below are common areas in which this documentation is useful.
• Why did you delete certain respondents? Answer: Respondents were deleted if they were missing more than 75% of the items for a particular scale.
• Why did someone get retained in the study when income is a critical variable, and they did not report their income? Answer: You decided that you could use a proxy for low income like qualifying for food stamps. In this way, you can treat the syntax file like an audit journal so that anyone can replicate your study findings later—including you!
• Data management decisions. When calculating summary scores, check the direction of each scale (e.g., does a larger number mean more?). If not, recode so that it does; a consistent direction eases interpretation and reduces confusion.
• Any data imputations and why. When collapsing data, such as recoding "other (specify)" responses into existing categories, note what was placed where.
• Before answering research questions, run the data to examine scale reliability and variable characteristics, verifying whether statistical assumptions such as normality are met.
• If the assumptions are not met, record how the variables were mathematically adjusted OR that you chose to proceed despite the violations.
• Explain how missing data are handled.
• Organize syntax by research question so the analyses for each question are documented together.
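The first two decisions in the list above (the 75% missing-data rule and recoding scale direction) can be sketched in Python. The four-item scale, its 1-to-5 scoring, and the respondent records are hypothetical; the comments double as the audit journal:

```python
# Hypothetical 4-item scale scored 1-5; None marks a missing response.
scale_items = ["q1", "q2", "q3", "q4"]
respondents = [
    {"id": 1, "q1": 5, "q2": 4, "q3": None, "q4": 2},              # 25% missing
    {"id": 2, "q1": None, "q2": None, "q3": None, "q4": None},     # 100% missing
]

def pct_missing(r):
    """Fraction of scale items this respondent left blank."""
    return sum(r[item] is None for item in scale_items) / len(scale_items)

# Decision (documented): delete respondents missing MORE than 75% of items.
kept = [r for r in respondents if pct_missing(r) <= 0.75]

# Decision (documented): q4 is reverse-worded; recode so larger = more.
for r in kept:
    if r["q4"] is not None:
        r["q4"] = 6 - r["q4"]   # flips 1<->5 and 2<->4 on a 1-5 scale
```

Anyone rerunning this file, including you in five years, sees both the rule and its rationale in the same place the rule is applied.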
I hope this blog provided new tips or served as a refresher. Planning ahead, using comments, and tracking your decisions will save time and anxiety in the future.
It may also prevent hair loss!
If you would like to know more, schedule a consultation today!
Direct quotes provide potent examples to highlight an argument or make a point. Too often, direct quotes are used in place of good writing. Executive reports, grants, dissertations, course papers, and even blogs frequently rely on direct quotes in place of summarizing and paraphrasing. In high school and undergraduate work, direct quotes may be a requirement of the assignment. This requirement is seldom the case in graduate or professional writing.
When is it appropriate to use a quote? Generally, a quote is necessary when the essence or power of the quote will be lost if summarized or paraphrased. Below are examples of quotes in this category.
Live as if you were to die tomorrow. Learn as if you were to live forever. Mohandas Karamchand Gandhi
We become not a melting pot but a beautiful mosaic. Different people, different beliefs, different yearnings, different hopes, different dreams. Jimmy Carter
I have found the paradox that if I love until it hurts, then there is no hurt, but only more love. Mother Teresa
Science is not only a disciple of reason but, also, one of romance and passion. Stephen Hawking
The presentation or discussion of prior research should not contain direct quotes. In other words, there is no need to quote “60% of Americans prefer the color blue, while 30% prefer white and 10% had no preference”. This example is simplistic but accurate. Stay tuned for additional tips and resources on paraphrasing and summarizing.
Stay tuned for details about our 2019 workshops!
Evaluators and researchers have an obligation to create, implement and report findings with integrity and honesty. The American Evaluation Association guiding principles state: Evaluators display honesty and integrity in their own behavior, and attempt to ensure the honesty and integrity of the entire evaluation process (http://www.eval.org/p/cm/ld/fid=105).
The Code of Federal Regulations defines misconduct in 42 CFR, subsection 93.103 as the “fabrication, falsification, or plagiarism in proposing, performing, or reviewing research, or in reporting research results.” This section further goes on to define fabrication as “making up data or results and recording or reporting them.” Falsification is defined as “manipulating research materials, equipment, or processes, or changing or omitting data or results such that the research is not accurately represented in the research record.” Plagiarism is defined as “the appropriation of another person’s ideas, processes, results, or words without giving appropriate credit.” It is specifically stated that “[r]esearch misconduct does not include honest error or differences of opinion.” (Part III – Department of Health and Human Services 42 CFR Parts 50 and 93 Public Health Service Policies on Research Misconduct; Final Rule). This definition influences the work that the US Food and Drug Administration (FDA), The Office of Research Integrity (ORI), and the Office for Human Research Protections (OHRP) perform on behalf of research integrity.
Fabrication is the making up of data or results and recording or reporting them. In the case of fabrication, no data existed and it was in some way created by the researcher. This would happen, for example, if a researcher needed to obtain data from 250 subjects but could only get 200 and decided just to create the rest of the data rather than attempt to recruit more subjects.
In falsification, on the other hand, the data exist, but they are somehow altered. Research material, equipment, or processes may be manipulated. For example, a scale could be altered so that it weighed all specimens as 10 grams heavier than they actually were at the beginning of an experiment; when the scale was properly calibrated at the end of the experiment, every specimen would then appear to have lost 10 grams. Data or results can also be changed or omitted so that the research is not accurately represented in the research record. For example, if a new procedure is under investigation, the researcher might choose not to use test results that reflected badly on the procedure.
The definition of plagiarism in research comes from the concept of plagiarism developed in writing and academia in general. It refers to using another researcher’s ideas, processes, results or words without appropriate citation and referencing. This could happen if a researcher uses a model that he or she was introduced to years earlier in an unpublished paper of a fellow graduate student. While it is perfectly appropriate to extend models that have been developed by others, the researcher cannot take and present the model as an original concept if it is not.
Misconduct may occur at any stage of research, beginning with the research proposal, through the conducting of research, into the reviewing of research or in the reporting of research results. In many cases, the misconduct will occur in more than one stage. It is not uncommon to see misconduct occur while the research is being conducted and then again during the reporting of results. Research misconduct may arise from a lack of knowledge, from carelessness, or from deliberate deception. In general, for true misconduct to occur, according to the Code, it cannot arise out of honest error or difference of opinion.
Research misconduct may also be committed by any member of the research team and it may not necessarily be known to the principal investigator on a study. There are many motivations to commit research misconduct and these motivations can differ depending upon the job of the person who commits the misconduct. So, while a principal investigator may commit research misconduct to increase publications or secure grant funding, a postdoc or fellow may commit research misconduct to hurry results. Some individuals may commit research misconduct out of laziness or forgetfulness. Others may commit it because of a lack of understanding of why portions of the protocol are important.
It is important that evaluators and researchers work to promote research integrity. Understanding what constitutes research misconduct is important for everyone in the field. The vast majority of evaluators and researchers work diligently to protect participants and conduct projects with integrity and ethics. When others do not, challenges may be created for all of us.
Printed with permission from Solutions IRB
Scientific merit is an essential component of the design, implementation, evaluation, and dissemination of initial and/or subsequent research outcomes.
The Origins of Scientific Merit
The origins of scientific merit lie within each scientific discipline's desire to build the most robust knowledge base possible. However, within the last century, serious research abuses have given rise to documents like the Nuremberg Code and the Helsinki Declaration, which also specifically speak to the issue of scientific merit.
The Nuremberg Code arose out of the Nuremberg Trials, in which Nazi doctors were tried for experiments conducted on prisoners and concentration camp victims during the Second World War. While, arguably, some useful information did result from a few of the Nazi studies, such as the work on hypothermia, none of the studies involved any form of consent, and much of the research would be considered useless by the scientific community in general, for example, studies attempting to change eye color with the injection of dye into the eye. Because of the significant abuses of the Nazi doctors, the Nuremberg Code states that experiments should yield scientifically useful results: research should not be engaged in on a whim or to repeat what is already known; the anticipated results of the study must contribute to science or benefit the field or discipline; and experiments must be conducted only by scientifically qualified persons. (ref 1) (ref 2)
The Helsinki Declaration also addresses scientific merit. It states that research must conform to generally accepted scientific principles, that the design and performance of each experimental procedure should be set out in an experimental protocol reviewed by an independent committee, and that scientifically qualified persons should conduct the research. (ref 3)
Expectations of research with good scientific merit
Scientific merit, in the most basic sense, looks at whether or not a study represents good science. For any study to have scientific merit, it must contain the following components: address an area of importance to the discipline, utilize established scientific principles, exhibit alignment within the study, demonstrate the proposed scientific knowledge gained from the study, and involve appropriately trained researchers.
Addresses an area of importance to the discipline
There is a crucial difference between what may be of interest to laypersons, practitioners, and scientists. So, while an individual may believe that a particular problem is worth studying because of casual observations made at work or in daily life, scientific merit requires that the problems investigated are considered valuable to the broader scientific community. The problem the researcher seeks to investigate comes from the appropriate academic literature. In turn, what the researcher is studying should extend the current literature in some way, thus advancing the discipline.
The study must address a problem that is supported by the scientific literature and is important to the scientific community in general. Hellstroem and Jacob (ref 68) point out that scientific merit is not only judged by scientists within the field of the researcher but "should be judged by researchers from disciplines or research orientations outside of that of the research being considered. . . . Scientific merit is also an issue of whether a scientific field is 'ripe for exploitation' in terms of technology. Thus, scientific merit is founded on the value of the proposed research to the rest of science and technology." (ref 68, p. 388)
Utilize established scientific principles
How research is conducted within the scientific community has evolved over a considerable period of time, and what might have been considered acceptable even 25 years ago may not be acceptable today. How a researcher conducts a study should always be based upon the accepted methods or best practices of science in general and within the researcher's particular field specifically. Research should conform to established scientific principles and procedures.
The researcher, whether designing a quantitative or qualitative study, must turn to the extant literature for guidance on the appropriate methods and procedures to be used within that study. While there are advancements and changes in how research may be conducted over time, and, in fact, research into methods is an area of study in and of itself, researchers should not simply make up how a study is conducted out of thin air. Researchers do not need to invent new methods to carry out each new study, because appropriate methods already exist within the literature; and when an appropriate method does not exist, that, in and of itself, will be the object of a study.
Alignment within the study
Any research study that has scientific merit is a coherent whole. From beginning to end, the study flows together. It makes logical sense. When we read the research questions, we should know what methodology the study will use, and we should be able to determine what the variables in the study are. The selection of subjects should flow from the methodology. The methods of data collection will make sense as will the data analysis. Everything within the study is linked together. The terminology used in the study will also align. Qualitative studies talk about research in one way and quantitative studies in another.
First and foremost, the research questions must be answerable. They should not be broad social or philosophical problems like “why is there war?” The research questions should be narrow in scope, and they must be nested within the researcher’s discipline. Beyond having answerable research questions, they must also be answerable in the way that the study proposes. The entire study must represent a blueprint for answering the research questions.
Demonstrate how scientific knowledge will be gained from the study.
The outcome of any study should be an increase in scientific knowledge. The findings or results do not have to be discipline-changing or represent a paradigm shift, but they must be an increase nonetheless. A study cannot merely repeat what is already known, either from previous studies or from general knowledge. Some of the most unethical studies that have been conducted violate this principle of scientific merit. For example, the San Antonio Contraception Study examined whether women who were given placebos rather than birth control pills were more likely to become pregnant. (ref 8)
The directive of adding to the pool of scientific knowledge does not imply that for every study something of statistical significance must be found. To the contrary, the results of many studies show us that something does not work, that there is no significant difference, or that the expected outcome does not occur. But, this lack of a finding is, in and of itself, a finding.
Involve appropriately trained researchers
Good science cannot come from individuals who are untrained. The researcher and everyone involved in the study must have appropriate training to conduct the study. This training will include education related to interaction with the subjects in the study, the instruments in the study, and even the statistics used in the study. If a study involves children, the researchers must be able to show that they have appropriate training in the psychological/medical/social/physical development of children or whatever aspect of child development the study involves. If a study involves veterans with post-traumatic stress disorder, then the researchers must show that they have specialized training in working with individuals with this disorder.
If a study involves administering psychological measures, like the Minnesota Multiphasic Personality Inventory, or the Wechsler Intelligence Scales, then the researchers must possess the specialized knowledge to administer the test. If a researcher is interested in conducting research into Post Traumatic Stress Disorder (PTSD) and wants to measure PTSD, the researcher might decide to administer the Clinician-Administered PTSD Scale (CAPS) (ref 69). CAPS is a good scale to make either a current or lifetime diagnosis of PTSD, but it must be administered by a clinician or clinical researcher who has a working knowledge of PTSD or an appropriately trained paraprofessional. If a researcher is investigating preschool settings, the Environment Rating Scales can be used to assess group programs for young children. And while certification is not required to administer the scales, the authors recommend extensive training. (ref 70) (ref 71)
Proposed recruitment and data collection plans should be based on the population characteristics, topic, and expected participation rates, and the proposed study should have a sample size appropriate for the data analysis. However, researchers do not have a crystal ball. Recruitment efforts may not yield the sample size needed, or, once the data are entered and ready for analysis, the expected range of responses may not be present. In these instances, revisions to the original plan may be needed and should be noted in any publications, reports, or dissemination efforts, typically in the description of the sample, the data analysis section, and the limitations section.
Scientific merit in grants and publication requirements
Scientific merit is a requirement for grant funding and publications. Funding agencies will often mention that the scientific merit of a particular proposal will be considered when making funding decisions. "The scientific merit of the proposal" is among the criteria that The Foundation for AIDS Research (amfAR) uses to evaluate the proposals and letters of intent that they receive; if a proposal does not have scientific merit, it will not be funded (ref 72). The Canadian Institutes of Health Research use both scientific merit and the potential impact of the research when reviewing research projects for funding (ref 73).
A Humorous Look at Scientific Merit
The parachute study was written to poke fun at researchers who insist that double-blind, placebo-controlled studies must be done to ensure that an intervention works (ref 74). [Permission to use the parachute article provided by Dr. Gordon Smith.] Clearly, no one needs to assign individuals to "parachute" and "no-parachute" groups. We already know that parachutes work, so there is no need to create a study testing whether using a parachute, as opposed to not using one, increases the survivability of a long "gravitational challenge." There is no need to conduct a study to investigate something that we already know, and no need to put people in harm's way to do so. It is not that research on parachutes cannot be undertaken, but this would not be the way to design such a study; placebo designs may be appropriate in other studies, but certainly not in this one.
Scientific merit is a critical component of research design, grant funding and application evaluation, and the dissemination of findings.