Sounds odd (unintentionally literal)

Yesterday I wrote that the expression “achieve a breakthrough” sounded odd to me, and I much prefer the more frequent “make a breakthrough”.

“Achieve a breakthrough” is, however, a robust collocation (after MAKE, ACHIEVE is the most frequent lemma with “a breakthrough” in COCA; and it has a high MI score). So why did it sound so odd to me? Have I somehow just not encountered it much? Well, maybe sorta.

In the COCA data, any form of ACHIEVE occurs immediately to the left of “a breakthrough” 24 times, while any form of MAKE in that position occurs 65 times, a ratio of about 1:3. COCA provides register/context information in its KWIC displays, and this info showed me that of ACHIEVE’s 24 occurrences, 2 were in spoken contexts (a little over 8%), while 18 of 65 MAKE hits were spoken (a little under 28%). So, the ratio in spoken contexts is 1:9. Alternatively, in non-spoken contexts the ratio is about 1:2.
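The arithmetic behind those ratios is easy to re-check. Here is a minimal sketch that uses only the four counts given above; the function and variable names are my own, invented for illustration.

```python
# Counts reported above: forms of each lemma immediately left of
# "a breakthrough" in COCA, and how many of those hits were spoken.
counts = {
    "ACHIEVE": {"total": 24, "spoken": 2},
    "MAKE": {"total": 65, "spoken": 18},
}


def share_spoken(lemma):
    """Proportion of a lemma's hits that come from spoken contexts."""
    c = counts[lemma]
    return c["spoken"] / c["total"]


def hits(lemma, kind):
    """Number of hits for a lemma in one slice of the data."""
    c = counts[lemma]
    if kind == "total":
        return c["total"]
    if kind == "spoken":
        return c["spoken"]
    if kind == "non_spoken":
        return c["total"] - c["spoken"]
    raise ValueError(f"unknown slice: {kind}")


def make_per_achieve(kind):
    """How many MAKE hits there are for every ACHIEVE hit in a slice."""
    return hits("MAKE", kind) / hits("ACHIEVE", kind)


for lemma in counts:
    print(f"{lemma}: {share_spoken(lemma):.1%} of hits are spoken")

for kind in ("total", "spoken", "non_spoken"):
    print(f"ACHIEVE to MAKE, {kind}: 1:{make_per_achieve(kind):.1f}")
```

Run as written, it should give roughly 1:2.7 overall, 1:9 for spoken contexts, and 1:2.1 for non-spoken contexts, which matches the rounded figures above.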


I was in a speaking/listening skills oriented class, and we were talking about “achieve a breakthrough”. Perhaps the expression sounded odd to me because it’s not an expression I hear much, even if it would be unremarkable to me in a written context. That is, perhaps it is primarily an expression of written English, but not spoken, and thus it struck me as out of place or something.

I don’t actually think COCA is the best source of data about spoken English, but this sort of thing could be explored in other, maybe more suitable corpora.

Title note: when I said it sounded odd to me, I meant in a broad sense it seemed odd to me. Now I realize I may have unintentionally described how it really, literally, sounded odd to me.

 

 


________ a breakthrough

What is the first word that comes to mind to fill the blank in this post’s title?

If you’re like me, then it’s “make”. Obviously it doesn’t have to be “make”, but that’s what came to my mind.

We are at the beginning of our Autumn term here, and in chapter 1 of the coursebook there were a few vocabulary/collocation activities. Among them were these patterns:

  • _________ progress / a sacrifice / a profit
  • _________ your ambition / success / a breakthrough

The students had a bank of words to choose from to fill in the blanks, and this bank included make and achieve. Now, looking at the sets of words, it’s clear that make goes with progress / a sacrifice / a profit, while achieve goes with your ambition / success / a breakthrough.

However, “achieve a breakthrough” sounded a little odd to my ears. Not wrong. Odd. One of the pitfalls of constructing a collocation exercise specifically for course/textbook purposes is that it often ignores other, more ‘natural’ collocations; in this case it actually ignored a verb that was in the exercise: “make”! A look at the top hits for “VERB a breakthrough” in COCA:

[Image: COCA frequency list for “VERB a breakthrough”]

So, “made” and “make” are the top hits (by far), and then “achieve” in the 3rd spot. Now, of course “achieve a breakthrough” is fine and acceptable, and clearly attested in the data, but why would one teach that collocation without teaching “make a breakthrough” first (or as well)?

Anyway, I was going to wait a couple weeks before I introduced some corpus activities in class, but I decided this was a fine time to introduce and demonstrate a very basic use of a corpus. The two sections using this particular coursebook are relatively high level, and I was pleased to notice some students quite quickly grasping (through their asking of questions about what they were seeing) that corpora offer information about the target language that they wouldn’t get in a dictionary/grammar, and certainly not in their coursebook.

To not mention*, it just feels right

@anthonyteacher has a great post at his site discussing the patterns “not to VERB” and “to not VERB”. He writes about his students’ reactions to the constructions, his own view, and some findings from Google N-grams and COCA. You should read his post in full.

I basically agree with everything he says, but there is one point he makes that I would like to extend a little bit. So I’d like to highlight this paragraph from his post, and especially the statement I put in red:

All of this data tells me several things. First, “to not” is on the rise, most likely due to the fact that the ability to separate an infinitive has become more accepted and “to not” has probably rolled in through a snowball effect. Second, the placement of “not” does not necessarily imply emphasis, as can be seen in the sentences above. Third, while my speech may make some of the older generations shake their first with anger, possibly telling me I am killing English, I can now reply confidently that my speech is the vanguard of an English where “not” is as placement-fluid as “they” is gender-fluid. My speech may be a speech that is likely to boldly go where few have gone before. Or to not boldly go, because language change is really unpredictable, and this is just a tiny thing.

I chose to highlight this section because I felt that sometimes my own choices regarding placement of “not” are definitely, if not necessarily, done for emphasis, but after thinking about it I don’t think it is a matter of placing emphasis per se. Rather, it is about restricting possible meanings/uses.

Let me explain. Here are two partial lines from COCA (query terms: not to mention):

1) … He would talk only if I promised not to mention he lived in …

2) … But tours and marketing materials, not to mention data on the average student, won’t tell you if that college will …

In the first line, I, personally, would probably phrase that as “to not mention”, though not necessarily. The point is that both constructions feel natural to me. However, I can’t imagine myself saying that about the second line. To me, and the way I’m processing these constructions, the first line’s meaning is straightforward, but the second line’s meaning is based on my understanding of “not to mention” as a fixed or partially fixed expression in this instance.

In this case, the construction is not simply negating the mentioning of something (in fact, the thing in question is explicitly and necessarily mentioned/understood). Indeed, the online Cambridge Dictionary, for example, defines “not to mention” as a phrase used when you want to emphasize something that you are adding to a list.

So, generally speaking, I process “to not VERB” as basically interchangeable with “not to VERB” (with a personal preference for “to not VERB”) when the meaning is straightforward (i.e. negating the verb). But “not to VERB”, perhaps because of its associations with certain fixed expressions, seems to me to have a broader range of usage. Something like this:

“not to VERB”: can negate the verb or have idiomatic/figurative meaning and usage

“to not VERB”: restricted to negating the verb

Here is a sample of COCA lines from @anthonyteacher’s post:

[Image: COCA concordance (KWIC) lines for “to not VERB”]

All the “to not VERB” uses here have meanings that can be understood as simply negating the verb. I suspect it would be this way throughout all the lines.

At least, until the language changes some more 😉


If I have said something glaringly, obviously wrong please tell me. Or if you have evidence of “to not VERB” used in an idiomatic/figurative way, please share it. Or if you have better choices for terminology etc. etc. etc. …

“I agree —.” Referencing with COCA

In the past couple weeks I have had reason to pull up frequencies or concordances in response to student questions on several occasions. For a teacher this is a pretty basic but useful application of corpora.

For instance, this past week in a business-focused ESP course we were reading through a role play which used the expression ‘I agree wholeheartedly’. A few types, or maybe lines, of questions arose. Is ‘wholeheartedly’ very common? Can I say ‘I wholeheartedly agree’? Can I say ‘I agree completely’? Can I say ‘I agree perfectly’? What if I agree but not that strongly? These didn’t arise all at once, but came about through discussion (some after I pulled up results in COCA).

Notably, to me, meaning was not an issue at all. All of the questions were regarding usage.

I ran a quick and relatively simple search on COCA using: I agree [r*]

([r*] is the PoS code for adverb)

Here are the first 20 results:

[Image: COCA frequency list for “I agree [r*]”, first 20 results]

Across the entire corpus we can see that I agree wholeheartedly is the second most common formulation of the I+agree+adverb pattern. I agree completely is by far the most common. The top 4, 6 of the top 10, and 8 of the top 20 seem to be synonymous expressions of the sense of ‘100% agreement’.

We did not have time to go through each expression but the students said they appreciated seeing several examples of alternative phrasings. For the expression I agree perfectly, I actually thought it sounded a little strange when my student asked if it was ok, but here it was, attested. Only 2 hits, but after opening the concordance lines I saw how it was used and my hesitation about it evaporated. For me, this was a good reminder of how corpus reference can help me (and by extension my students) avoid over-generalizing from my own intuition.

The 9th most frequent hit, I agree in, is also notable, and my students asked specifically about it. So we took a look at its 6 concordance lines. 4 of the lines revealed the expression I agree in part and 1 revealed I agree in general; so they found some useful expressions for expressing various degrees of agreement to go along with hits such as (13) I agree basically, (18) I agree strongly, or (19) I agree somewhat.

We also briefly looked at the query: I [r*] agree

This resulted in a very different frequency list:

[Image: COCA frequency list for “I [r*] agree”]

There are more overall hits for this pattern. I totally agree is the top hit, with I completely agree the second. Interestingly, I certainly agree is the 4th most frequent hit despite I agree certainly occurring zero times in the corpus. In fact, the I+adverb+agree pattern revealed several expressions that had no lexical correlates in COCA to the I+agree+adverb pattern. This wasn’t the point of the referencing in class, so we didn’t dwell on it and focused on the variety of expressions that can be used, but, for me, it was definitely food for thought. Agree?
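For anyone who wants to try the same two-pattern comparison on their own texts without a corpus interface, here is a rough sketch. It makes some loud assumptions: the adverb list is a tiny hand-picked stand-in for real PoS tagging (COCA's [r*] tag does this properly), the sample sentences are invented by me and are not corpus data, and the function name is hypothetical.

```python
import re
from collections import Counter

# Assumption: a tiny hand-picked adverb list instead of a real PoS tagger.
ADVERBS = {"completely", "wholeheartedly", "totally",
           "certainly", "somewhat", "strongly"}

# Assumption: invented sample text for illustration, NOT corpus data.
SAMPLE = (
    "I agree completely with the proposal. I totally agree. "
    "Well, I certainly agree that it matters. I agree somewhat, "
    "but I agree wholeheartedly with the goal. I completely agree."
)


def count_patterns(text):
    """Count 'I agree ADV' and 'I ADV agree' three-word sequences.

    Punctuation and sentence boundaries are ignored, so this is much
    cruder than a tagged-corpus query.
    """
    agree_then_adverb = Counter()   # I agree ADV
    adverb_then_agree = Counter()   # I ADV agree
    words = re.findall(r"[A-Za-z']+", text)
    for first, second, third in zip(words, words[1:], words[2:]):
        if first != "I":
            continue
        second, third = second.lower(), third.lower()
        if second == "agree" and third in ADVERBS:
            agree_then_adverb[third] += 1
        elif third == "agree" and second in ADVERBS:
            adverb_then_agree[second] += 1
    return agree_then_adverb, adverb_then_agree


after_verb, before_verb = count_patterns(SAMPLE)
print("I agree ADV:", after_verb.most_common())
print("I ADV agree:", before_verb.most_common())
# Adverbs attested before the verb but never after it in this sample,
# analogous to 'I certainly agree' having no 'I agree certainly' in COCA.
print("Only before the verb:", sorted(set(before_verb) - set(after_verb)))
```

The last line is the interesting one: it lists adverbs that show up in one slot but not the other, which is the kind of asymmetry the two COCA lists revealed.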

That rule that you don’t know that you know you know…is not a rule

A couple weeks ago a tweet from Matthew Anderson of the BBC blew up a little bit. The tweet had an excerpt from Mark Forsyth’s book The Elements of Eloquence: Secrets of the Perfect Turn of Phrase. The excerpt asserted that “adjectives in English absolutely have to be in this order: opinion-size-age-shape-colour-origin-material-purpose Noun. So you can have a lovely little old rectangular green French silver whittling knife. But if you mess with that word order in the slightest you’ll sound like a maniac.”

The tweet has more than 51,000 retweets and nearly 75,000 likes. It has also spawned a bunch of articles.

This is too credulous. Fortunately, Language Log was asked about the claim and explained how adjective order is not so simple. And many folks have put forth “big bad wolf” (size-opinion) as a clear violation of this so-called rule.

Forsyth responded by saying “big bad wolf” is actually a case of ablaut reduplication (think of words like zig-zag or tick-tock, they are never zag-zig or tock-tick) and, I think, he’s saying that this means it’s not subject to the rule that “adjectives in English absolutely have to be in this order”. If that’s the case, then absolute ≠ absolute.

But even if we let there be an exception for ablaut reduplication and decide that the ‘absolute’ rule for adjective order doesn’t apply in such cases, does the rule hold otherwise?

We can look for evidence.

The Language Log article mentioned above noted several hits in COCA for patterns that don’t fit the rule, including “big ugly”. I got on COCA myself to see what some of the concordance lines looked like, and ran a few other searches as well (other than “big ugly”, in the searches I purposely used words found in the example from the excerpt). Here are a few of the many, many hits that violate the ‘rule’:

Adjective pattern: size-opinion; Search terms: big ugly

  • This is a cute little town with a big ugly fire problem right now
  • On the platform was a big ugly man who talked loud and waved his arms around until his face was red
  • The construction of the big ugly office building was going very well

Adjective pattern: material-origin; Search terms: silver [j*] [nn*]

  • We gave her a small silver Ethiopian cross because we knew she was a devout Catholic

Adjective pattern: color-shape; Search terms: green [j*]

  • Inside the matchbox were nine little green triangular pills
  • One torn green triangular corner of the murdered Slap-of-Wack bar blows across the desert
  • he takes his pass from its slot in the green rectangular pass book before adding his name and time of departure to the long list

Adjective pattern: age-opinion, size-opinion; Search terms: [j*] lovely

  • seemed infinitely more beautiful for the presence of this new lovely white figure in their midst
  • At the doorway to the kitchen stood a tall lovely woman with long black hair and wearing a pink linen dress

None of these constructions sound maniacal.
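To make the check concrete, here is a small sketch that tests a string of adjectives against the order in the excerpt. The category labels are my own hand-assigned judgments for the adjectives quoted in this post (an assumption, and a debatable one for some words), and the names in the code are hypothetical.

```python
# The order asserted in the excerpt.
FORSYTH_ORDER = ["opinion", "size", "age", "shape",
                 "colour", "origin", "material", "purpose"]
RANK = {category: i for i, category in enumerate(FORSYTH_ORDER)}

# Assumption: hand-assigned categories for adjectives quoted in this post.
CATEGORY = {
    "lovely": "opinion", "ugly": "opinion",
    "big": "size", "little": "size", "small": "size", "tall": "size",
    "old": "age", "new": "age",
    "rectangular": "shape", "triangular": "shape",
    "green": "colour", "white": "colour",
    "french": "origin", "ethiopian": "origin",
    "silver": "material",
    "whittling": "purpose",
}


def violations(adjectives):
    """Return adjacent adjective pairs that contradict FORSYTH_ORDER."""
    ranks = [RANK[CATEGORY[adj.lower()]] for adj in adjectives]
    return [
        (adjectives[i], adjectives[i + 1])
        for i in range(len(ranks) - 1)
        if ranks[i] > ranks[i + 1]
    ]


examples = [
    ["lovely", "little", "old", "rectangular",
     "green", "French", "silver", "whittling"],   # the excerpt's own example
    ["big", "ugly"],
    ["small", "silver", "Ethiopian"],
    ["little", "green", "triangular"],
    ["new", "lovely", "white"],
]
for adjectives in examples:
    print(" ".join(adjectives), "->", violations(adjectives) or "fits the order")
```

Only the excerpt's own knife example passes; each of the attested COCA strings has at least one pair in the "wrong" order.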

I think the pattern Forsyth describes is a general preference for order that is usually going to result in good patterning, but it’s definitely not an absolute rule. The rules governing adjective order are messier than that (see the Language Log post). I only felt the need to comment and remark on it because I noticed several ELT-related accounts passing it around and I thought “Really?? Did anyone check the data??” This post doesn’t have anything really in the way of using corpora in the classroom or anything like that, but hopefully it can be a reminder that corpora are excellent reference tools (especially to check sketchy ‘rules’).

**Update**

I realized someone might object to the “silver Ethiopian cross” example since an Ethiopian cross is a particular style of cross. So here’s a different example from COCA that demonstrates the same pattern of material-origin (search terms: wooden [j*] boat):

  • a traditional wooden Bahamian boat hand-built in the early 1960’s

Playing by ear and verbing by body parts: Using COCA to discover usage

Last week, a passage in the textbook for one of my classes had the expressions ‘he tunes it by ear’ and ‘they learn to play by ear’. For many of the students (not all, tho) this was a new expression, ‘[verb] by ear’, but in the context of the passage most of them figured out, on their own, that it meant the people tuning or playing instruments were able to do so without any sheet music or other aids. The textbook had a question using the expression ‘play by ear’ in the chapter review, so I felt it would pay to explore the pattern a little.

So we had this pattern, ‘[verb] by ear’. What does ‘by ear’ really mean? And what verbs could fit into that slot? Several students agreed that it meant using one’s ears to do something. That was a fine start. But would it make any sense to say ‘sleep by ear’ or ‘imagine by ear’ with senses of using one’s ears to sleep or using one’s ears to imagine? No, not really. I asked my students if they could replace ‘by ear’ with a different expression in the phrase ‘they learn to play by ear’. Pretty quickly, they came up with ‘they learn to play by listening’. That’s more like it. So ‘[verb] by ear’ means that someone is using listening skills to [verb].

At this point, I directed the students to open up the newly-mobile-friendly COCA (every one of my students has an iPad) and enter the following query under the List function: VERB by ear.

[Image: COCA search screen for “VERB by ear”, search area highlighted in red]

[Image: VERB by ear: top 20 frequency list]

Looking at the frequency data, it was immediately clear to them that ‘play(s)/playing/played by ear’ is, by far, the most frequent formulation of this pattern. ‘Learn(s)/learning/learned by ear’ is also quite a bit higher in frequency than other formulations of the pattern, most of which have only 1 or 2 occurrences in the corpus (note: ‘birding by ear’ is the title of a book for birdwatchers about learning to recognize birdsongs, and references to this book in the corpus make this formulation appear higher in the frequency list than one might otherwise expect).

Using this data my students and I discussed how the pattern ‘[verb] by ear’ can be used with a lot of different verbs, but in general they are safe knowing that it’s most often used with forms of play and learn.

For homework, students were assigned one phrase from a set of phrases: ‘[verb] by mouth’ or ‘[verb] by hand’ or ‘[verb] by foot’ or ‘[verb] by eye’; and they were told to do a similar analysis to what we did in class. That is, explain how the expression is used and find out what verbs commonly fit into the empty slot. Also, they were to find 3 example sentences (using concordance lines from COCA, an online learner’s dictionary, SkELL, etc.).

In our next class students who were assigned the same phrase sat together and checked/discussed their results, and then they taught members of other groups about their assigned phrase.


Overall, this activity went over very well. A big part of that is due to the BYU Corpora interface redesign. It’s so much cleaner visually, easier to navigate, much easier to search, less intimidating, and so much more mobile-friendly.

Importantly, considering other class and content needs, this did not take up a lot of (in-class) time. The activity on the first day took around 15 mins (in a 90 min class), and the activity on the second day took around 25 mins. Within the activity there were elements of individual, group, and whole class work, which helped keep it from seeming tedious.

Of course, there were other options for how to exploit the corpus, too. For example, instead of assigning expressions for the students to research, we could have had some fun by having them find expressions of the ‘[verb] by [body part]’ variety themselves, maybe discovering some surprising combinations. I suspect you can think of a few ways to positively tweak this kind of activity, too.

 

Visual appearance of corpus resources is a barrier to uptake in ELT

*This is not a rigorous examination of appearance or barriers to uptake of corpus resources. It’s just a few disorganized thoughts, externalized.*

Even though corpora are widely used in materials development (dictionaries, grammars, etc.), classroom use of corpus resources is under-developed. If, like me, you see a lot of pedagogic benefits in corpus resources you might describe them as underused. I actually think this underuse is quite rational, for several reasons, and I sympathize with teachers who choose to avoid “corpus work” in their classrooms.

IMO, one of the biggest barriers to uptake is that so much software, the presentation of data, and the specialized terminology are daunting, seem inaccessible or irrelevant, feel time-consuming, and generally seem to not be user-friendly. The sense of not being user-friendly is, I think, a huge stumbling block for those who would otherwise be interested in experimenting with corpus resources.

Whether the resources are actually user-friendly or not is, I think, a subjective question. But anecdotally, I think a lot of teachers feel this way. And to be honest, I feel this way about quite a few of the resources out there, even ones I like and use. It took quite a bit of practice and training just to start to feel any sense of confidence using them, whether it was hands-on use of a corpus and analyzing language myself or integrating into class corpus-based software/information that I needed to make pedagogic decisions about. I don’t think this difficulty was just because the work is complex. I think a feeling of user-unfriendliness creates a psychological aversion to engaging with the resource(s), as well as more conscious decision-making along the lines of thinking that the resources are too complex/opaque/irrelevant for one’s teaching or for learner-use.

There’s no easy fix to this. Nor should there be. But there are some things that would improve the perception of the user-friendliness of corpus resources. I believe one of those things is simple layout and design, the way the resources look. Some corpus resources have a steep learning curve, but poor visual design and layouts make them appear even more difficult and inaccessible. So teachers avoid trying them out and/or sharing them with students. That is why I’m very happy about the update to the BYU collection of corpora. EFLNotes conducted a brief interview with Mark Davies and obtained a few screenshots of the new-look interface. I think it looks great. I think it looks more inviting, modern, and accessible.

It’s not only the BYU corpora: several (maybe most?) corpus resources might be looked upon more favorably in teacher-circles (and experimented with by teachers) if they modernized/updated their appearance. I know this seems like a superficial thing to worry about, but it is a real perception. Would better visual design suddenly make corpora go mainstream in ELT? No, that’s simplistic and fanciful. Would it improve the perception of their user-friendliness? I think so. Of course, there are many other ways (and perhaps more substantial ways) that corpus resources could be designed that would make them more pedagogically attractive to teachers. Still, improving the appearance of corpus resources would be a step forward, I think.

I am *VERY* interested to hear if anyone has any thoughts about this or related issues (ease-of-use, visual presentation, layout, uptake barriers, etc.).

A final note: another nice aspect of the BYU corpora update is that it will make the interface mobile-friendly. This is also important for uptake, IMO, and a topic I plan on addressing in a future post.