In order to examine the differences between the legal reasoning of first and third year law students, this study used a new combination of research techniques drawn from both previous legal and social science research. From the legal research area, the central role of the law school case method of study led to the use of essay questions to collect data about legal reasoning. The definition of legal reasoning, and the essay scoring scale presented in this chapter, likewise were tailored to measuring the legal reasoning of persons who had studied using the case book method. From the broader area of research on thinking, reasoning, and problem solving, the technique of using thinking aloud protocols was drawn. The basic idea was that thinking aloud protocols would capture data about the students' thinking processes at the same time as they created their essay answer data. Thus there would be from first and third year law students both process data, the thinking aloud protocols, and product data, the essay answers, which could be analyzed for differences. The rest of the data collection and analysis would be built around the core of product and process data.
Since this study breaks new ground, the design selected emphasizes exploration of the students' legal reasoning rather than an in depth replication of an existing line of research. The emphasis on exploration led to multiple approaches both for data collection and data analysis. For data collection, four kinds of data were collected from both first and third year law students: product data in the form of essay question answers; process data in the form of thinking aloud protocols; general data from the observations of the researcher as the students worked on the questions; and background data collected to double check the equivalency of the two groups of students who were being compared. For data analysis, the methods included: visual inspection and statistical analysis of numerical data like the essay answer scores and the background undergraduate grade point average data; computer searches of the essay answers for the frequency with which key terms were used; and the researcher's individual scanning of both the thinking aloud protocols and the researcher's notes to determine if thinking patterns could be isolated and compared.
The starting point for explaining the design will be the details of how the students were selected and grouped for study. Then the data collection and data analysis methods will be presented in more detail. The central concern of each step in the design will be to assist in answering the question of how the legal reasoning of first year law students compares to that of third year students.
Population.
The population to be sampled consisted of students at an accredited midwestern law school. Selecting this population had both benefits and drawbacks when compared to the populations used in other studies. On the benefit side, selecting students from an accredited, but not "top twenty", law school provided an opportunity to study students from a population that has not been represented in prior studies. Bryden's students, for example, were only from top law schools. In addition, the particular midwestern school selected had a much higher than average number of required courses. Thus the three year curriculum was more uniform than at schools which allow more elective courses. This increased uniformity lessened the uncontrolled variation in the educational experience of the third year students who were selected to participate.
A more subtle curb for uncontrolled variation was that the students at the selected institution were all "part time" students. Part time students take fewer credit hours per semester but still complete the program in three years due to the availability of summer sessions. The course sequences that they take are the same as would be taken by full time students. However, the commuting or employment responsibilities typical for a part time student often shield the part time student from some informal educational influences. The full time student is more likely to be present at a law school outside of required class hours. This allows full time students the luxury of an opportunity, for example, to have a coffee break in the lunchroom with a professor. A part time student dashing from the office to class would not get that same opportunity. This leads to the somewhat ironical result that the part time students selected in this study may have been more representative of the results of traditional legal education than full time students would be because the part time students have had less exposure to informal sources of legal education.
The drawbacks to using this population include those associated with selection at only one institution and with the use of volunteers. Each of these drawbacks will be discussed separately in the following paragraphs.
The ability to generalize from the results of this study will be qualified by the decision to select students at only one institution. However, in addition to the benefits mentioned previously, this decision also appears reasonable in light of the results of Bryden's study. Bryden found that the results were "suggestively consistent" from school to school (1984, p. 500). This finding is not surprising in light of the uniformity in content and method in traditional legal education. Therefore restrictions on the ability to generalize may not be as severe as they might first appear.
The drawbacks of using volunteers are well documented in the literature. However, given the highly intensive type of testing involved in this study, it appeared that there was no reasonable alternative to the use of volunteers, at least if student rights were to be adequately protected.
Informed Consent.
Since the principal researcher is himself a law professor who could be viewed as an influential person regarding future decisions affecting student participants, special care was devoted to obtaining the free and informed consent of participants. Fortunately the research site had two characteristics which greatly assisted the protection of student participants. The first characteristic was that almost all courses at the law school use anonymous grading; the only exceptions were courses in which the principal researcher could easily avoid involvement. The second characteristic was that all participants had a minimum of twenty semester credit hours of law school and thus already had studied the law of Torts. Torts includes the study of the doctrine of informed consent and thus participants in this study were better equipped to make a free and informed decision.
The primary method of ensuring informed consent was through the use of the consent form which accompanied the participation invitation letter (Appendix "A"). The consent form contained numbered paragraphs setting forth the following points. Paragraph 1 summarized the purpose, procedures, and duration of the research. Paragraph 2 expressed the participant's agreement both that the research had been explained and that the participant understood it, including the following two risks. One risk was the stress associated with the taking of a three hour essay examination. The other was the possibility that the results of the essay examination might not be in accordance with the participant's own self image. Paragraph 3 explained that the principal researcher retained the right to grade participants in future law school courses if those courses used the school's anonymous grading procedure. It also explained that the principal researcher would not join in law school decision making concerning an individual participant unless the participant made a written request asking the principal researcher to join in a particular decision process. Finally, paragraph 3 indicated that prospective participants should not sign the consent form unless they understood the role of the principal researcher as it was set forth in paragraph 3. Paragraph 4 stated that the participants were free to discontinue participation at any time without recrimination. Paragraph 5 stated that all results would be treated with strict confidence regarding the identity of any participant but that individual participants would be able to obtain their own results if they so desired. Paragraph 6 stated that participation involved no guaranteed benefits to the participants other than the agreed fee paid for their participation. Paragraph 7 stated that signing the form indicated that the signer freely consented to participate.
The consent form also included the phone numbers of two persons from whom additional information could be obtained by prospective participants.
Sampling.
Volunteers were solicited from students who were in either their third semester or in their eighth or ninth semester of law school. An invitation was mailed to each student followed by classroom announcements. The letters (Appendix "B") included a tear-off section on which students could volunteer by writing their name, address, and telephone number. Participation was encouraged by offering twenty-five dollars to each volunteer who was chosen for participation.
Forty-eight students volunteered. Of the forty-eight volunteers, eight were not eligible for participation: five had cumulative grade point averages below a 2.0 and three had completed between forty-five and sixty credit hours and thus did not qualify for inclusion in either the first or third year group.
Each of the remaining forty volunteers was sent a letter (Appendix "C") asking them to schedule a data collection session at a time that would be convenient for them. Using letters to invite the students avoided the possible coercive effects attendant to an in person solicitation from a person, in this case a law professor, who could be viewed as being in a position of authority over the student. Letter invitations also standardized the information given to each student and thus avoided the possible confounding effects of information that might slip out in a conversation with the student.
Sessions were scheduled at all times of day on all days of the week to accommodate participants. Nevertheless, of the forty eligible volunteers, only thirty-one took the additional step of signing up for a session and this number was reached only by extending the sign up period three weeks longer than originally planned. Fortunately, of the thirty-one students, sixteen were first year and fifteen third year. One first year student was used for a trial run of the data collection process and this left an equal number of students, fifteen, in each of the two groups. Since about one hundred ten first year students were eligible, between fourteen and fifteen percent actually participated. For the third year students, the participation rate was between six and seven percent since fifteen participated out of a total eligible pool of about two hundred thirty.
Data was collected at individual administrations of four essay questions (Appendix "D") during a three hour session. The four essay questions were drafted by the principal researcher. All of the questions also were reviewed by a panel of two law school professors. Each of these professors had well over ten years of legal education experience. This panel approved the questions as to form, substance, and likelihood of eliciting "legal reasoning" as defined in this study.
The legal subject matter covered by the questions was confined to the areas of substantive law covered in the first year of law school: contracts, crimes, civil procedure, property, and torts (torts are civil wrongs, other than breach of contract, such as defamation and civil fraud). This offset the superior substantive knowledge of the third year students with the more recent exposure of the first year students to the materials.
Previous experience with law school examinations indicated that three hours would be more than enough time to complete the four questions, even with time allowed for verbally describing one's thoughts while writing. Only one student came close to using the full three hours and even that student did not complain of time pressures.
The essay questions were drafted to present problems for which there exists no single, easy answer. This was done in the hopes of eliciting more legal reasoning by avoiding automated processes in the same way as Johnston and Afflerbach (1985) did in their reading comprehension study. However, the absence of a single, simple answer did not mean that the questions were impossible or even very difficult. The questions were well within the capabilities of even the first year law student participants.
As previously noted, a three hour block was scheduled to accommodate the student's schedule. Data was collected in two ways. First, as in Bryden's study, students were asked to write answers to each of the four essay questions. Second, the students were asked to verbally describe what they were thinking while they were writing the essay answers. A videotape camera and recorder were used to capture these verbal responses. No time limit was given for each question but the students were asked to complete the four questions within the three hour block and a clock was provided to assist the students in keeping track of the time. Both the number of questions and the time allowed would be typical for a law school examination.
To make data collection seem more like a lawyer's typical office work, and less like a law examination, students wrote their answers while seated at a desk in an office. Writing was done on yellow legal pads rather than examination bluebooks. Students were told that they could take a break at any time to get a cup of coffee, use the washroom, etc.
Prior to beginning writing, each student signed a statement attesting to the fact that the student had not discussed the essay questions with any of the other participants, either directly or indirectly. Each student was also given warm up instructions (Appendix "E") and two scrambled letter problems so that they could practice the thinking aloud procedure. The two scrambled words used, in the order given, were "koro" (rook) and "npepha" (happen). After the students had worked through these words, they were asked to select one set of questions from forty sets spread face down on the desk. The sequence of questions in each set had been systematically varied so that the question that was first in the first set became the second question in the second set and so forth. The resulting forty sets of questions then had been shuffled. By selecting one of the face down sets, the student received a randomly assigned sequence of questions.
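The construction of the question packets described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure actually used to prepare the packets, which were assembled and shuffled by hand. It assumes that "systematically varied" means a simple cyclic rotation of the four questions; the function names and the optional seed are hypothetical. The question labels are the abbreviations used in Appendix "D".

```python
import random

# Abbreviated question labels from Appendix "D".
QUESTIONS = ["FF", "LL", "SNS", "TANK"]


def rotated_sets(questions, n_sets):
    """Build n_sets orderings by cyclic rotation (an assumed reading of
    "systematically varied"): the question that is first in set 0 is
    second in set 1, third in set 2, and so forth."""
    k = len(questions)
    sets = []
    for i in range(n_sets):
        shift = i % k
        if shift == 0:
            sets.append(list(questions))
        else:
            # Move the last `shift` questions to the front.
            sets.append(questions[-shift:] + questions[:-shift])
    return sets


def shuffled_packets(questions, n_sets, seed=None):
    """Shuffle the rotated sets, mimicking packets spread face down."""
    packets = rotated_sets(questions, n_sets)
    random.Random(seed).shuffle(packets)
    return packets


if __name__ == "__main__":
    packets = shuffled_packets(QUESTIONS, 40, seed=1)
    # A student "draws" one packet from the shuffled stack.
    print(packets[0])
```

Under this reading, forty sets contain ten copies of each of the four rotations, so each question appears in each position equally often before the shuffle.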
After completion of the essay questions, participants were asked to fill out a questionnaire (Appendix "F"). This questionnaire collected data on participants' legal experience outside of the required courses. Included was information on elective courses taken, employment experiences if legally related, and close associations, if any, with attorneys. This information served to give some indication of what other factors may be affecting the performance of participants.
The students' written responses and their verbal comments were analyzed in two stages. First, the written answers were anonymously classified by a panel of eight coders, each of whom had experience with the law. Second, the principal researcher examined the students' written answers in conjunction with the students' concurrent verbal comments and the researcher's observation notes in an attempt to further understand what the students did and did not do and why. The goal of both stages of the analysis was to answer the primary question of how the legal reasoning of students who have completed almost one year of traditional legal education compared to the legal reasoning of those students who have completed almost three years.
Both stages of data analysis followed the work of Levi as summarized in Levi's previously quoted definition of legal reasoning. To assist the evaluation of students' legal reasoning, that definition was used as the basis for a classification scale (Appendix "G") which was given to the panel of eight coders who evaluated the student answers. The scale's classifications were defined according to the type of authority used by the student and whether the facts of the question or previous legal case were used in the student's answer. The types of authority recognized included nonlegal authority, legal rules, and prior case law such as a reported decision of an appellate court.
This approach to coding was chosen as the best way to indirectly assess the extent to which students were using Levi's reasoning by analogy. If a student were to use all three of Levi's steps, the student would have to mention the facts of the previous case, the facts of the present problem, how the two sets of facts compared and contrasted with each other, the disposition of the previous case, the legal rule which could be developed from consideration of both sets of facts, and the disposition for the present problem. Thus, the extent to which a student worked with both the facts of the cases and of the problems would be one indication of reasoning by analogy. If, on the other hand, a student mentioned only a common sense or legal rule, it would indicate deductive reasoning that stops short of using all of Levi's steps.
The coding classifications forced the coders to separately consider whether the answer was based upon the use of facts and whether it used a given type of authority. Breaking the coding process down into these steps was an attempt to achieve higher reliability in coding than if the coders were simply given Levi's definition and asked to evaluate the extent of its use in an answer. Breaking the coding process into separate steps for facts and authority also attempted to achieve higher reliability than would have been achieved with the use of issue counting in the manner done by Bryden.
Letter combinations, such as "NLAA" (Non Legal Authority Alone) or "CF-C&P" (Case with Facts of the Case and this Problem), were used to indicate both the type of authority being used and whether that authority was used in conjunction with facts or not. The combinations used were: NLAA (Non Legal Authority Alone); NLAF (Non Legal Authority with Facts); RA (Rule Alone); RF (Rule with Facts); CA (Case Alone); CF-C or P (Case with Facts of Case or this Problem); and CF-C&P (Case with Facts of Case and this Problem). The use of letters instead of numbers helped the coders remember what each classification was and it also avoided the inference that one classification was necessarily better than another. The principal researcher met with each coder and used a practice question and answers (Appendix "H") to assure that the classifications were applied correctly by the coder. The practice answers were obtained from student responses to a classroom exercise which was otherwise unconnected to this study.
First stage data analysis.
The first stage of data analysis was classification of the student answers by the eight coders mentioned above. Each of the four essay question answer sets was randomly assigned to two coders who worked independently. Thus each coder classified thirty essay answers. Prior to giving the answers to the coders, each answer to each question was assigned a random identification number. Coders therefore classified the answers without any knowledge either of the student writer's identity or of the group, first or third year, to which the student belonged.
After the answers had been classified and returned to the principal researcher, the classifications were converted to numbers to assist analysis. The numbers were assigned in the order the classifications appear in the classification guide (Appendix "G"). Thus "NLAA" (Non Legal Authority Alone) became "1", "NLAF" (Non Legal Authority with Facts) became "2", "RA" (Rule Alone) became "3", "RF" (Rule with Facts) became "4", "CA" (Case Alone) became "5", "CF-C or P" (Case with Facts of Case or this Problem) became "6", and "CF-C&P" (Case with Facts of Case and this Problem) became "7". The participant's final score for each question was the total of the two numbers resulting from the coders' classifications. Thus if a particular essay answer received a "5" from one coder's classification and a "4" from the second coder's, then the participant's final score for that answer was a "9". Final scores on a question could therefore range from 2 to 14. Higher numbers were assumed to indicate more complete conformity with Levi's definition of legal reasoning and thus the classification scale was assumed to be an ordinal scale.
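The conversion from letter classifications to a final score can be illustrated with a short Python sketch. The classification labels and their numeric order come from the classification guide (Appendix "G"); the names CODE_ORDER, CODE_TO_NUMBER, and final_score are hypothetical and are used here only for illustration.

```python
# Classifications in the order they appear in the classification guide.
CODE_ORDER = ["NLAA", "NLAF", "RA", "RF", "CA", "CF-C or P", "CF-C&P"]

# "NLAA" -> 1, "NLAF" -> 2, ..., "CF-C&P" -> 7
CODE_TO_NUMBER = {code: rank for rank, code in enumerate(CODE_ORDER, start=1)}


def final_score(first_coder, second_coder):
    """Sum the numeric values of the two coders' classifications."""
    return CODE_TO_NUMBER[first_coder] + CODE_TO_NUMBER[second_coder]


if __name__ == "__main__":
    # One coder chose "CA" (5) and the other "RF" (4).
    print(final_score("CA", "RF"))  # prints 9
```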
Since the classification scale has not been validated in prior research, the scale also was recoded into two dichotomous variables which reflect key elements of Levi's definition of legal reasoning. The two variables were based upon whether facts were used in the answer or not and on whether a rule (common sense or legal) or a case was used. The recoding was done with the researcher acting as a tie-breaker if the two scorers' classifications would lead to different results. Since both the seven point scale and the dichotomous variables were based upon Levi's definition, it was expected that both coding methods would lead to similar results. However, the recoding was done to allow that assumption to be verified.
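A sketch of the recoding may make the two dichotomous variables more concrete. The mapping below is an assumption inferred from the classification labels: a classification is taken to "use facts" when its label includes facts, and nonlegal authority is grouped with legal rules as a "rule (common sense or legal)". The function name recode and the tie_breaker argument, which stands in for the researcher's judgment, are hypothetical.

```python
# Assumed mapping: does the classification involve the use of facts?
USES_FACTS = {
    "NLAA": False, "NLAF": True,
    "RA": False, "RF": True,
    "CA": False, "CF-C or P": True, "CF-C&P": True,
}

# Assumed mapping: is the authority a case (True) or a rule (False)?
USES_CASE = {
    "NLAA": False, "NLAF": False,
    "RA": False, "RF": False,
    "CA": True, "CF-C or P": True, "CF-C&P": True,
}


def recode(mapping, first_coder, second_coder, tie_breaker=None):
    """Return the dichotomous value for one answer.

    If the two coders' classifications lead to different values, the
    researcher's decision (tie_breaker) is used instead.
    """
    a, b = mapping[first_coder], mapping[second_coder]
    if a == b:
        return a
    if tie_breaker is None:
        raise ValueError("coders disagree; a tie-breaking decision is needed")
    return tie_breaker


if __name__ == "__main__":
    # "RF" and "CF-C&P" agree on facts but disagree on rule versus case.
    print(recode(USES_FACTS, "RF", "CF-C&P"))
    print(recode(USES_CASE, "RF", "CF-C&P", tie_breaker=True))
```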
Student classification scores next were sorted back into first and third year student groupings. The scores were then analyzed in three ways. First, the scores were graphed for visual examination of how each group of students performed on each question. Then mean scores for each question were computed and graphed for both the first and third year student groups. Finally, interpretation of the scores was assisted by the computation of several statistics. To examine how well the classification scales worked, a Spearman-Brown scorer reliability estimate was computed as well as correlations between the original seven point scale and the dichotomous scales produced by the recoding. To examine the results obtained from the classification scales, a Student's t test of significance for the differences between the student group means on each question was computed as well as an analysis of covariance using undergraduate grade point averages and Law School Admission Test scores as covariates.
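Two of the statistics named above can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not the computation actually performed in this study. It assumes that the Spearman-Brown estimate is the usual stepped-up reliability, 2r / (1 + r) for two coders, where r is the correlation between the two coders' numeric classifications, and that the t test is the pooled variance form for two independent groups. The function names are hypothetical and the numbers in the usage example are made up for illustration; they are not data from this study.

```python
from math import sqrt
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r, k=2):
    """Reliability of the combined score of k coders, given the
    correlation r between single coders."""
    return k * r / (1 + (k - 1) * r)


def pooled_t(group_a, group_b):
    """Student's t statistic for two independent groups (pooled variance).
    Degrees of freedom are len(group_a) + len(group_b) - 2."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / na + 1 / nb))


if __name__ == "__main__":
    # Made-up classifications from two coders for the same five answers.
    coder_1 = [3, 4, 4, 6, 7]
    coder_2 = [3, 4, 5, 6, 6]
    r = pearson_r(coder_1, coder_2)
    print(round(spearman_brown(r), 3))

    # Made-up final scores (2 to 14) for a third year and a first year group.
    third_year = [8, 9, 10, 7, 11]
    first_year = [7, 8, 9, 8, 10]
    print(round(pooled_t(third_year, first_year), 3))
```

With fifteen students in each group, the pooled test would have 28 degrees of freedom, and a one-sided critical value would apply given the directional alternative hypothesis.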
All analyses of the data, whether informal visual inspection procedures or formal statistical tests, were conducted based upon a null hypothesis of no difference between the two groups. The alternative hypothesis was that a third year student group mean score on a question would be greater than that of the first year student group. Due to the small sample size, a significance level of α = .05 was chosen for the statistical tests. The power of the test (Hayes, 1973, pp. 419-420) was .94 based upon a sample size of fifteen when the percentage of variance accounted for was .25 or more. The further assumption for the power computation was that the approximation was based upon a normal distribution.
To gain more information about the participating students, and to determine whether the writing ability of the first and third year students was similar, the students' answers were analyzed using the computer program Grammatik III (Wampler, 1988). This program indicated the reading level of what the students wrote. The idea behind using the test was that, if the reading levels would be similar, then there would be one less source of variation that could influence coding of the answers.
Second stage data analysis.
The primary goal of the second stage of data analysis was to search for explanations of whatever findings were made in the first stage analysis. The search process proceeded in two steps. First, even as the students worked on the essay questions, wrote their answers, and concurrently made verbal comments, the principal researcher observed their words and actions. While observing, the principal researcher sat in a corner of the office behind where the student was working. Notes could therefore be made without cuing the participants as to desired responses. The only prompt that was given was the statement: "Please remember to speak up." Since the students each received the questions in packets in which the questions had been sequenced at random, abbreviated notations were adopted to allow the researcher to make unambiguous references to the questions. Thus, in Appendix "D", the four essay questions are labeled: "FF" (free food); "LL" (Larry Landowner); "SNS" (Saturday night special); and "TANK" (Tanker). As indicated by the words in parentheses, the abbreviated notations were selected based upon some of the facts in the essay questions. Second, all portions of the videotapes relating to one of the questions ("FF") were transcribed into typewritten protocols. Both the notes of the principal researcher and the typewritten protocols were searched for explanations of the first stage findings.
At each step of the second stage of data analysis, guidance was provided by Levi's definition of legal reasoning. The panel of coders used classifications designed to indirectly indicate the use of Levi's reasoning by analogy. The principal researcher used Levi's definition directly as a guide when observing the students, viewing the videotapes, and examining the typewritten protocols. In addition to personally examining the protocols, the principal researcher used a computer text search program (ZyIndex, 1987) to determine if certain key words were used by the students. Key words were chosen based on the likelihood of their appearance if a student was using Levi's reasoning by analogy. The words chosen included "analogy", "case", "decision", "example", "common law", "rule", and "principle". When any of these words was located by the computer search, the principal researcher re-examined the passage to assure that the material had not been missed during the researcher's examination of the protocol.
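The key word search can be illustrated with a short Python sketch. This is a stand-in for illustration only; the study itself used ZyIndex, and nothing here reproduces that program's behavior. The function name find_key_terms, the context parameter, and the sample sentence are hypothetical. The sketch assumes whole-word, case-insensitive matching, so "case" would not match "cases" unless the plural were added to the list.

```python
import re

# Key words listed in this chapter.
KEY_TERMS = ["analogy", "case", "decision", "example",
             "common law", "rule", "principle"]


def find_key_terms(protocol_text, terms=KEY_TERMS, context=40):
    """Return (term, position, surrounding passage) for every whole-word,
    case-insensitive match, ordered by position in the protocol."""
    hits = []
    for term in terms:
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(protocol_text):
            start = max(0, match.start() - context)
            end = min(len(protocol_text), match.end() + context)
            hits.append((term, match.start(), protocol_text[start:end]))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    sample = "This case is like the rule in the common law."
    for term, position, passage in find_key_terms(sample):
        print(term, position, passage)
```

Each passage returned could then be re-examined by hand, as was done with the passages located by the computer search.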
The second stage of data analysis also provided an opportunity to obtain data or insights not based upon Levi's definition of legal reasoning. Since collection and analysis of concurrent verbal comments went beyond what had been done in previous studies, efforts were also made to remain open to the possibility of new and unexpected findings. For example, it might have become clear that, contrary to expectations based on previous studies, legal reasoning is learned only very late in law school. First year students might mimic an automated process and only much later realize the underlying legal reasoning implications. Such a finding would be difficult to make based upon examination only of the results of participants' problem solving but it might appear in the more complete data preserved by thinking aloud while problem solving. In any case, such a finding would be an example of a pattern of problem solving that might be discovered in the second stage of data analysis.
This chapter has detailed the design of this study, the types of data to be collected, and the methods of analysis to be used. The essay question data, the thinking aloud protocols, the researcher's observation notes, and the background data all were aimed at collecting data about the legal reasoning of first and third year law students. The visual inspection techniques, the statistical analyses, and the computer searches all were selected as means of comparing the data about the students' legal reasoning. The findings from all this will be presented in Chapter IV.